Abstract

Newton, in an unauthorized textbook, described a process for solving simultaneous equations that later
authors applied specifically to linear equations. This method (that Newton did not want to publish, that
Euler did not recommend, that Legendre called "ordinary," and that Gauss called "common") is now named
after Gauss: "Gaussian" elimination. One suspects he would not be amused. Gauss's name became associated
with elimination through the adoption, by professional computers, of a specialized notation that Gauss
devised for his own least squares calculations. The notation allowed elimination to be viewed as a sequence
of arithmetic operations that were repeatedly optimized for hand computing and eventually were described
by matrices.

1. Lacking Both History and Heritage

The relocation of scientific research from academies to universities in the 19th century [34] increased
employment for mathematicians as teachers, which Grattan-Guinness [87, p. 177] notes both sped the
professionalization of the subject and coincided with a preference for pure over applied mathematics. The same
taste was manifest in historical scholarship, in that pure subjects became more thoroughly chronicled than
applications. For example, at the beginning of the century, the notoriety of calculating where again to observe
Ceres earned the youthful Gauss fame enough to realize his wish for a life free of teaching pure mathematics,
see Bühler [25, p. 46] and Dunnington [52, pp. 405-410], yet by the end of that century, mathematical
histories neglected Gauss's applied work, see Cajori [27] and Matthiessen [135].

Grattan-Guinness (op. cit.) argues for the existence of two recollections of the mathematical past: history 
recounts the development of ideas in the context of contemporary associations, while heritage remembers 
reinterpreted work that embodies the state of mathematical knowledge. The thesis of this paper is that much 
of genuinely applicable mathematics lacks both: no history because applications may be deemed the purview 
of non-mathematical faculties, no heritage because applications might not enter or remain in the corpus. This 
thesis is illustrated by the neglected story of the algorithm now called Gaussian elimination. It belongs as 
much to the history of science and technology as to the intellectual history of mathematics. 
2. Gaussian Elimination Today

Both elementary and advanced textbooks discuss an algorithm called Gaussian elimination. A first course 
in algebra may solve two linear equations in two unknowns by various means, whereas precalculus algebra 
invariably introduces a method for solving arbitrarily large systems of linear equations. As explained by 
Cohen et al. [38, pp. 743+, sec. 10.2], "elementary operations" produce an "equivalent system" in
"upper-triangular form" that can be solved by "back-substitution."

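$$
\begin{array}{rcl}
x + 2y + z &=& 3 \\
x + y + 2z &=& 9 \\
2x + y + z &=& 16
\end{array}
\;\Rightarrow\;
\begin{array}{rcl}
x + 2y + z &=& 3 \\
-y + z &=& 6 \\
-3y - z &=& 10
\end{array}
\;\Rightarrow\;
\begin{array}{rcl}
x + 2y + z &=& 3 \\
-y + z &=& 6 \\
-4z &=& -8
\end{array}
\;\Rightarrow\;
\begin{array}{rcl}
z &=& 2 \\
y &=& -4 \\
x &=& 9
\end{array}
\tag{1}
$$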

This paper follows current usage by referring to any algorithm that is essentially equivalent to equation (1)
as "Gaussian elimination" whatever its period or source. The distinguishing features are: the equations
and variables may need to be rearranged so the leading equation contains the leading variable, the leading
equation is used to remove the leading variable from each of those following, these steps apply recursively to
the following (modified) equations until, finally, the back-substitution. The form of the algorithm employed
in equation (1) is viewed as canonical: the leading equation remains unchanged while variables are removed
by subtracting an appropriate multiple of it from each following equation.
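
The canonical form just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `gaussian_elimination` is hypothetical, exact rational arithmetic is used so that the hand calculation is reproduced digit for digit, and equations are rearranged only when a leading coefficient happens to be zero.

```python
from fractions import Fraction

def gaussian_elimination(A, b):
    """Solve A x = b by canonical elimination followed by back-substitution.

    Illustrative sketch: rows are swapped only when a leading coefficient is zero.
    """
    n = len(A)
    # Augmented rows [coefficients..., right-hand side] in exact arithmetic.
    rows = [[Fraction(v) for v in A[i]] + [Fraction(b[i])] for i in range(n)]
    for k in range(n):
        # Rearrange so the leading equation contains the leading variable.
        pivot = next((i for i in range(k, n) if rows[i][k] != 0), None)
        if pivot is None:
            raise ValueError("system is singular")
        rows[k], rows[pivot] = rows[pivot], rows[k]
        # Subtract a multiple of the leading equation from each following one.
        for i in range(k + 1, n):
            m = rows[i][k] / rows[k][k]
            rows[i] = [ri - m * rk for ri, rk in zip(rows[i], rows[k])]
    # Back-substitution, from the last unknown to the first.
    x = [Fraction(0)] * n
    for k in reversed(range(n)):
        s = sum(rows[k][j] * x[j] for j in range(k + 1, n))
        x[k] = (rows[k][n] - s) / rows[k][k]
    return x

# The system of equation (1): unknowns x, y, z.
solution = gaussian_elimination([[1, 2, 1], [1, 1, 2], [2, 1, 1]], [3, 9, 16])
print(solution)  # [Fraction(9, 1), Fraction(-4, 1), Fraction(2, 1)]
```

Applied to the system of equation (1), the sketch returns x = 9, y = -4, z = 2, in agreement with the display.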

Petersen and Arbenz [150, p. 107] explain this algorithm is the standard test for the speed of computers in
scientific work. Its widespread use in so large a field as scientific computing results in many algorithmic
variants that are collectively called, simply, Gaussian elimination. The variations are distinguished by acronyms,
adjectives, and eponyms. At that level of differentiation the canonical algorithm of equation (1) is also named
either "classic" Gaussian elimination or "Doolittle's method." Nevertheless, advanced or specialized texts
always begin by stating exactly this algorithm identified as Gaussian elimination. For example see Duff et al.
[50, pp. 43+], Farebrother [61, pp. 3+], Golub and Van Loan [83, pp. 92+], Higham [99, pp. 158+], Petersen
and Arbenz [150, pp. 23+], and Stewart [163, pp. 148+].

Today, the technical literature as well as textbooks of all levels encourage the inference that Gauss
introduced the method of equation (1) and his usage was somehow remarkable compared to prior art. The
justifications offered for the Gaussian appellation thus range from simple citation to careful indirection. Cohen
et al. [38, p. 743] claim "Gauss used this technique" to analyze the orbit of Pallas [72] though "the essentials"
already appeared in ancient Chinese texts. The prior use in China leads Katz [114, p. 29] to add the
qualification, "the method now known as Gaussian elimination," that nevertheless allows for an independent origin
in the work of Gauss. Higham [99, p. 187] attributes "the first published appearance of Gaussian
elimination" to Gauss but in an earlier paper [?]. Stewart [163, p. 148] notes Gauss in reality eliminated variables
from quadratic forms rather than from linear equations, but he suggests the method of equation (1) stems
directly from Gauss because it resembles his "original derivation." Only Farebrother [61, p. 3] leaves open the
possibility of prior European origin, remarking that "Gauss's formalization" of the "traditional schoolbook
method" appeared in his Pallas work. His explanation begs the questions: what schoolbook did Gauss read,
and what did he contribute?

3. Elimination Before Gauss 

Algebra and its history are invariably of interest but more so for the polynomial equations that gave 
rise to incommensurate and imaginary numbers. Consideration of simultaneous linear equations is therefore 
comparatively rare in the secondary literature and also in the primary sources. Both are surveyed here for 
systems of linear equations in periods even much before Gauss. 

By far the most impressive treatment known from antiquity is chapter 8 of the Nine Chapters of the 
Mathematical Art, a problem "book" anonymously and collectively written in China. Liu Hui wrote the 
first of several commentaries in the 3rd century, including comments on chapter 8, so the method described
there for solving systems of linear equations is at least as old, and is believed to be much older, although no
venerable text survives. Martzloff [134, pp. 128-131] explains chapters 1-5 are known from a 13th century
copy, and chapters 6-9 are reconstructed from 18th century quotations of a lost 15th century encyclopedia.

Mathematicians in ancient China represented numbers by counting rods. They organized elaborate
calculations by placing the rods inside squares arranged in a rectangle. For chapter 8, each column of squares
3rd Century AD: Problem 19 in book I of the Arithmetica of Diophantus [96, p. 136]: find four numbers such that the
sum of any three exceeds the fourth by a given amount. Solution using symbols for clarity: let $n_i$ be the numbers, $s$ the
sum, and $d_i$ the differences. Then $s - n_i$ is the sum of the others, so $(s - n_i) - n_i = d_i$ or $n_i = (s - d_i)/2$. Summing,
$s = n_1 + n_2 + n_3 + n_4 = 2s - (d_1 + d_2 + d_3 + d_4)/2$, hence $s = (d_1 + d_2 + d_3 + d_4)/2$. Thus $s$ can be evaluated from the given
data, and then so can the $n_i$.

$$
\begin{array}{rcl}
-n_1 + n_2 + n_3 + n_4 &=& d_1 \\
n_1 - n_2 + n_3 + n_4 &=& d_2 \\
n_1 + n_2 - n_3 + n_4 &=& d_3 \\
n_1 + n_2 + n_3 - n_4 &=& d_4
\end{array}
\;\Rightarrow\;
n_i = \frac{d_1 + d_2 + d_3 + d_4}{4} - \frac{d_i}{2}
\tag{2}
$$

Before 3rd Century AD: Problem 1 in chapter 8 of the Nine Chapters [113, pp. 391, 399-403]: from 3 top-grade rice
paddies, 2 medium-grade, and 1 low-grade, the combined yield is 39 dou of grain, etc. What is the yield of a paddy of
each grade? Solution:

$$
\begin{array}{rcl}
3x + 2y + z &=& 39 \\
2x + 3y + z &=& 34 \\
x + 2y + 3z &=& 26
\end{array}
\qquad
\begin{array}{rrr}
1 & 2 & 3 \\
2 & 3 & 2 \\
3 & 1 & 1 \\
26 & 34 & 39
\end{array}
\;\Rightarrow\;
\begin{array}{rrr}
0 & 0 & 3 \\
4 & 5 & 2 \\
8 & 1 & 1 \\
39 & 24 & 39
\end{array}
\;\Rightarrow\;
\begin{array}{rrr}
0 & 0 & 3 \\
0 & 5 & 2 \\
36 & 1 & 1 \\
99 & 24 & 39
\end{array}
\;\Rightarrow\;
\begin{array}{rcl}
z &=& 2\tfrac{3}{4} \\
y &=& 4\tfrac{1}{4} \\
x &=& 9\tfrac{1}{4}
\end{array}
\tag{3}
$$

5th Century AD: Problem 29 in chapter 2 of the Aryabhatiya of Aryabhata [35, p. 40]: to find several numbers when the
results of subtracting each from their sum are known. Solution: sum the known differences and divide by the quantity of
terms less one. The result is the sum of all the numbers, from which they can then be determined.

$$
\begin{array}{rcl}
n_2 + n_3 + \cdots + n_n &=& d_1 \\
n_1 + n_3 + \cdots + n_n &=& d_2 \\
n_1 + n_2 + n_4 + \cdots + n_n &=& d_3 \\
&\vdots& \\
n_1 + n_2 + \cdots + n_{n-1} &=& d_n
\end{array}
\;\Rightarrow\;
s = \frac{d_1 + d_2 + \cdots + d_n}{n - 1}
\tag{4}
$$

Figure 1: Ancient problems with their solutions couched in modern symbolic algebra and their interpretations as simultaneous linear
equations.



corresponds to a modern linear equation, so the ancient rectangle must be rotated counterclockwise by 90
degrees to obtain the coefficient tableau of modern mathematics. Problem 1 in chapter 8 is frequently displayed
as representative of the solution method, see equation (3) in Figure 1. Chapter 8 solves 18 different systems
of equations in this systematic way. Unlike equation (1), both columns to be combined are scaled by the
leading number in the other. Subtracting the right column from the left column removes the leading number
from the left while preserving integer coefficients. For more on chapter 8 see Lay-Yong and Kangshen [126].
The Nine Chapters appears to be the only general discussion of what can be interpreted as systems of linear
equations before the 18th century.
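
The scaling rule described above can be illustrated with a short Python sketch. It is not a reconstruction of the counting-rod procedure: the function names are hypothetical, the tableau is stored by rows in the modern orientation rather than by columns, and nonzero leading numbers are assumed throughout.

```python
from fractions import Fraction

def cross_multiply_eliminate(rows):
    """Fraction-free elimination: scale each pair of rows by the other's leading number.

    Illustrative sketch; assumes every leading number met along the way is nonzero.
    """
    rows = [list(r) for r in rows]
    n = len(rows)
    for k in range(n):
        for i in range(k + 1, n):
            p, q = rows[k][k], rows[i][k]
            # p * (row i) - q * (row k) removes the leading number, keeping integers.
            rows[i] = [p * ri - q * rk for ri, rk in zip(rows[i], rows[k])]
    return rows

def back_substitute(rows):
    """Solve an upper-triangular tableau [coefficients..., right-hand side]."""
    n = len(rows)
    x = [Fraction(0)] * n
    for k in reversed(range(n)):
        s = sum(rows[k][j] * x[j] for j in range(k + 1, n))
        x[k] = (rows[k][n] - s) / rows[k][k]
    return x

# Problem 1 of chapter 8: top-, medium-, and low-grade paddies; yields in dou.
tableau = cross_multiply_eliminate([[3, 2, 1, 39], [2, 3, 1, 34], [1, 2, 3, 26]])
yields = back_substitute(tableau)
print(tableau)  # [[3, 2, 1, 39], [0, 5, 1, 24], [0, 0, 36, 99]]
print(yields)   # [Fraction(37, 4), Fraction(17, 4), Fraction(11, 4)]
```

Every intermediate entry stays an integer, and fractions appear only in the final yields of 9 1/4, 4 1/4, and 2 3/4 dou.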

Assertions that other ancient civilizations had mathematicians who solved linear systems do not bear close
scrutiny. For example, O'Connor and Robertson [144], without citation, attribute to Babylonians "around
300 BC" a problem about two fields of grain. Incredibly, this problem of unknown provenance is now used
to prepare teachers in California, see National Evaluation Systems [137, p. 9]. Besides misdating ancient
Babylon to Hellenistic or Parthian Mesopotamia, the problem of O'Connor and Robertson is inconsistent with
the tablets quoted by [105] and summarized by Bashmakova and Smirnova [13, p. 3]. When Babylonians did
pose problems that can be interpreted as simultaneous equations, invariably at least one equation is nonlinear,
reflecting an understandable Babylonian interest in areas of fields given knowledge of diagonals, perimeters,
and the like.

Some claims that the ancients solved simultaneous linear equations ignore evidence that special solution
methods were used. For example, Dedron and Itard [44, p. 303] see a linear system in a problem about 5
men and 100 loaves of bread from the Rhind papyrus. Gillings [79, pp. 170-172] discusses this problem
(number 40) and the solution that is recorded in the papyrus, which is based not on linear equations but rather
on arithmetic progressions. A knowledge of both arithmetic and geometric progressions, Gillings argues,
was a distinguishing feature of ancient Egyptian mathematics. Several problems solved by Diophantus of
Alexandria can be interpreted as simultaneous linear equations. Equation (2) in Figure 1 is the most elaborate
of these. This problem is a special case that happens to be solvable by a repetitive formula and therefore is
amenable to special reasoning. Hermann Hankel [92, p. 165], a prominent mathematician who also wrote
on the history of mathematics, remarked that studying 100 Diophantine solutions would not suggest how to
solve the 101st problem.¹ A class of problems solved by Aryabhata in India is similarly special, see equation
(4) in Figure 1. The two problems stated in this figure appear to be representative of all the simultaneous
linear equations known from the periods and regions of Diophantus and Aryabhata.
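
The special character of these two problems shows in how little computation their closed-form solutions require. The following Python sketch assumes the modern restatements given in Figure 1. The function names and the numerical data are hypothetical illustrations, not quotations from the ancient texts.

```python
from fractions import Fraction

def diophantus_four(d):
    """Four numbers n_i such that the sum of the other three exceeds n_i by d_i."""
    s = Fraction(sum(d), 2)               # s = (d_1 + d_2 + d_3 + d_4) / 2
    return [(s - di) / 2 for di in d]     # n_i = (s - d_i) / 2

def aryabhata(d):
    """Numbers n_i such that the sum of all the numbers except n_i equals d_i."""
    s = Fraction(sum(d), len(d) - 1)      # s = (d_1 + ... + d_n) / (n - 1)
    return [s - di for di in d]           # n_i = s - d_i

# Hypothetical data chosen for illustration.
print(diophantus_four([20, 30, 40, 50]))  # the numbers 25, 20, 15, 10
print(aryabhata([9, 8, 7]))               # the numbers 3, 4, 5
```

Neither function eliminates anything: each exploits the symmetry of its problem, which is why such solutions give no guidance for a general system.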

The equivalent of single polynomial equations, but notably not simultaneous equations, were solved by 
several Arabic-speaking mathematicians in medieval times. Examples are in the work of the encyclopedist 
Al-Khwarizmi, from whom we have the words algebra and algorithm, and in the writings of Leonardo of 
Pisa, also known as Fibonacci, who travelled in Arabic-speaking lands. Hogendijk [100] regards knowledge
of the medieval period "by no means complete" because many Arabic scientific manuscripts have not been 
studied. 

Bashmakova and Smirnova [13] survey the European algebraic tradition and its sources from the ancient
to the abstract. Symbolic algebra developed during the European Renaissance in the arithmetization of
geometry and in the theory of equations. That undertaking was more than a stepwise development culminating
in modern notation. Schmidt [159] explains the conceptual differences that separated Viète from the later,
root-oriented theory of equations of Descartes and others. For these authors in translation see Descartes [46],
Schmidt [158], and Vieta [175]. By the end of the 16th century an audience had developed for textbooks that
taught arithmetic, how to express "questions" in terms of symbolic equations, and the solution thereof.

To obtain a comprehensive picture of algebra books for education, Kloyda [116] surveyed 107 texts
printed between 1550 and 1660. These number 9 Spanish, 12 English, 19 each French and Italian, and 24
each from Germany and the Netherlands. Only 4 of the 107 texts discussed simultaneous linear equations.
Pelletier du Mans [149] has the earliest example, a problem said to have originated with Cardano, about three
men with three sums of money. After explaining Cardano's solution, Pelletier solved the problem by directly
manipulating equations, restated in modern notation in equation (5) of Figure 2. Pelletier wrote an infix "p."
(piu) for modern +, also "m." (meno) for -, and for equality he wrote the word. (This notation rather than
being the author's choice may of course have been imposed by limited typography.) The solution is obtained
from the 10th, 5th, and 1st equations. Buteo [26] presented a more direct solution of a similar problem, again
restated in modern notation in equation (6) of Figure 2. Buteo wrote "." or "," for +, and "[" for =. He
performed the same double-multiply elimination as the Nine Chapters in equation (3), although Buteo used
different equations for the back-substitution. Here the solution is from the 6th, 5th, and 3rd equations whereas
the Nine Chapters and the canonical elimination of equation (1) would retain the 1st, 4th, and 6th. Gosselin
[85] solved a problem with four equations by similar methods, being more or less direct, and Rahn [153] did
the same for three equations.
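
Buteo's steps can be checked mechanically. The Python sketch below replays the combinations listed for his problem in Figure 2 and then back-substitutes through the 6th, 5th, and 3rd equations. The helper `combine` and the storage of each equation as a list of coefficients are assumptions made for illustration.

```python
from fractions import Fraction

def combine(equations, terms):
    """Linear combination of equations; `terms` holds (multiplier, 1-based index) pairs."""
    width = len(equations[0])
    return [sum(m * equations[i - 1][j] for m, i in terms) for j in range(width)]

# Buteo's problem: columns are the coefficients of A, B, C and the right-hand side.
eqs = [[3, 1, 1, 42], [1, 4, 1, 32], [1, 1, 5, 40]]
eqs.append(combine(eqs, [(3, 2), (-1, 1)]))    # 4th = 3 x 2nd - 1st
eqs.append(combine(eqs, [(3, 3), (-1, 1)]))    # 5th = 3 x 3rd - 1st
eqs.append(combine(eqs, [(11, 5), (-2, 4)]))   # 6th = 11 x 5th - 2 x 4th

# Back-substitution through the 6th, 5th, and 3rd equations.
C = Fraction(eqs[5][3], eqs[5][2])
B = (eqs[4][3] - eqs[4][2] * C) / eqs[4][1]
A = (eqs[2][3] - eqs[2][1] * B - eqs[2][2] * C) / eqs[2][0]
print(eqs[3:])   # [[0, 11, 2, 54], [0, 2, 14, 78], [0, 0, 150, 750]]
print(A, B, C)   # 11 4 5
```

The 6th equation scales the 5th by the leading number of the 4th and the 4th by the leading number of the 5th, which is the double-multiply step shared with the Nine Chapters.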

4. "this bee omitted by all that have writ introductions" 

The 18th century produced several more algebra books. A cursory inspection finds 35 printed in England
alone from 1650 to 1750, including one published over the objections of Isaac Newton.² Whiteside [181, 182]
describes Newton's work on algebra which extended roughly from his appointment to the Lucasian
professorship in 1669 until he began composing the "Principia" in 1684. Much of this work dealt with investigations
of algebraic curves, but Newton also addressed the theory of equations. No publications resulted immediately
from the latter efforts. In 1669-1670 Newton wrote "observations" and amendments for a Latin version of



1 Heath [96, pp. 54-58] translates Hankel's comments and also cites Euler's opinion on the generality of the methods used by
Diophantus.

2 Apart from Newton's, the books are: 1650 Moore, 1652 Oughtred, 1653 Balam, 1660 Leybourn, 1663 Brasser, Petri, and Backer, 
1669 Renaldini, 1673 Kersey, 1680 Perkins, 1685 Wallis and Caswell, 1698 Ward, 1700 Moxon and Tuttell, 1702 Cocker, 1702 Harris, 
1705 Parsons and Wastell, 1706 de Graaf, 1707 Berkeley, 1709 Alexander, Ditton, and Cobb, 1711 Ozanam, 1717 Kersey and Halley, 
1728 Royer, 1728 Jacob, 1737 Ashby, 1738 Ronayne, 1739 Wolff, 1739 Hanna, 1740 Webster, 1741 Saunderson, 1742 Hammond, 1745 
Simpson, 1746 Crosby, Wilcox, and Clark, 1748 MacLaurin, 1748 Muller, 1749 Holliday, 1750 Fenning, 1750 Loughton and Bickham. 
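16th Century AD: Problem of Pelletier du Mans [149]:

$$
\begin{array}{rrrcl}
 & 1. & 2R + A + B &=& 64 \\
 & 2. & R + 3A + B &=& 84 \\
 & 3. & R + A + 4B &=& 124 \\
\text{2nd} + \text{3rd} \;\Rightarrow & 4. & 2R + 4A + 5B &=& 208 \\
\text{4th} - \text{1st} \;\Rightarrow & 5. & 3A + 4B &=& 144 \\
\text{1st} + \text{2nd} \;\Rightarrow & 6. & 3R + 4A + 2B &=& 148 \\
\text{1st} + \text{3rd} \;\Rightarrow & 7. & 3R + 2A + 5B &=& 188 \\
\text{6th} + \text{7th} \;\Rightarrow & 8. & 6R + 6A + 7B &=& 336 \\
6 \times \text{3rd} \;\Rightarrow & 9. & 6R + 6A + 24B &=& 744 \\
\text{9th} - \text{8th} \;\Rightarrow & 10. & 17B &=& 408
\end{array}
\tag{5}
$$

16th Century AD: Problem of Buteo [26]:

$$
\begin{array}{rrrcl}
 & 1. & 3A + B + C &=& 42 \\
 & 2. & A + 4B + C &=& 32 \\
 & 3. & A + B + 5C &=& 40 \\
3 \times \text{2nd} - \text{1st} \;\Rightarrow & 4. & 11B + 2C &=& 54 \\
3 \times \text{3rd} - \text{1st} \;\Rightarrow & 5. & 2B + 14C &=& 78 \\
11 \times \text{5th} - 2 \times \text{4th} \;\Rightarrow & 6. & 150C &=& 750
\end{array}
\tag{6}
$$

Figure 2: European Renaissance problems with their solutions stated in modern notation from the compilation by Kloyda [116].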



a Dutch algebra text, Kinckhuysen [115], that his acquaintance John Collins planned to publish in England.
Collins abandoned the project when newer books appeared. Newton himself taught algebra at Cambridge
for 11 years beginning with the 1673-1674 academic term. During that time he wrote and repeatedly
revised an incomplete manuscript for his own algebra treatise that was to be named "Arithmeticae Universalis."
His last algebra manuscript was prepared in 1684 when, for unknown reasons, Newton suddenly honored
the requirements of the Lucasian professorship by depositing with Cambridge University his lectures for the
algebra course. The bulk of those notes were transcribed by his secretary, Humphrey Newton (no relation),
from Newton's previous algebra manuscripts. After Newton left academic life, his lectures were published
in their original Latin (1707, 1722) and in translation (1720, 1728) under the intended title of his aborted
treatise, "Universal Arithmetic." Newton had no claim to material that the university had paid him to prepare,
nevertheless he strongly objected to its publication, as explained by Whiteside [182, v. 5, p. 11], lest the old
lecture notes be misinterpreted as representing his latest research. The second English edition with Newton's
changes appeared the year after his death.

In the realm of unintended consequences it is to be anticipated that Newton's comparatively accessible 
algebra textbook became, as characterized by Whiteside [182, v. 5, pp. 54-55, fn. 1], "the most widely read
and influential of his writings." Thanks to Whiteside's impressive scholarship, a passage that is relevant 
to Gaussian elimination can be traced directly to Newton in the commentary on Kinckhuysen's textbook. 
Newton remarked to Collins that contemporary textbooks lacked an explicit description of how to solve 
collections of equations. 

Though this bee omitted by all that have writ introductions to this Art, yet I judge it very propper 
& necessary to make an introduction compleate. 

— Isaac Newton, margin note to John Collins, 
quoted by Whiteside [182, v. 2, p. 400, n. 62]

Therefore Newton proposed to insert a new chapter that first explained the overall strategy of solving
simultaneous equations and then listed the tactics by which it might be accomplished.³



3 The new chapter for Kinckhuysen's textbook can be found in Newton's original [182, v. 2, pp. 400-411, parallel Latin and English
texts]. The material also appears in the transcribed lecture notes [182, v. 5, pp. 122-129, parallel texts]. That text was copied into the
Latin and English editions of the unauthorized textbook, whose second English edition has been reproduced [181, v. 2, pp. 36-38].
Of the Transformation of two or more Equations into one, in order to exterminate the
unknown Quantities.⁴

When, in the Solution of any Problem, there are more Equations than one to comprehend 
the State of the Question, in each of which there are several unknown Quantities; those Equations
(two by two, if there are more than two) are to be so connected, that one of the unknown
Quantities may be made to vanish at each of the Operations, and so produce a new equation . . .
And you are to know, that by each Equation one unknown Quantity may be taken away, and
consequently, when there are as many Equations and unknown Quantities, all at length may be 
reduc'd into one, in which there shall be only one Quantity unknown. — Newton [142, p. 60] 
and prior 

On the pages following this rule, Newton [142, pp. 61-62] offered several methods for removing a variable
from two equations, including "equating" and "substituting." 

The Extermination of an unknown Quantity by an Equality of its Values. 

When the quantity to be exterminated is only of one Dimension in both Equations, both its 
Values are to be sought by the Rules already delivered, and the one made equal to the other. 

Thus, putting a + x = b + y and 2x + y = 3b, that y may be exterminated, the first Equation
will give a + x - b = y, and the second will give 3b - 2x = y. Therefore a + x - b = 3b - 2x, . . .

The Extermination of an unknown Quantity, by substituting its Value for it. 

When, at least, in one of the Equations the Quantity to be exterminated is only of one
Dimension, its Value is to be sought in that Equation, and then to be substituted in its room in the
other Equation. ... — Newton [142, pp. 61-62] and prior

The context and accompanying examples make clear Newton meant a general approach for solving
simultaneous nonlinear equations. Indeed, Newton [142] considers simultaneous linear equations only in the
illustration quoted above. In all his work, the only system of 3 or more linear equations appears to be a single,
contrived example in the manuscript of his incomplete treatise, see Whiteside [182, v. 5, p. 567, problem 3].
Many of Newton's exercises are motivated by Cartesian geometry as he practiced it or by classical physics as 
he invented it, and therefore are apparently original, and they contain no simultaneous linear equations. 
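
For the linear illustration quoted above, Newton's two tactics can be compared side by side. The Python sketch below is restricted to two linear equations written as p x + q y = r. The function names and the numerical values chosen for a and b are hypothetical, since Newton left them symbolic. Equating the two values of y gives a + x - b = 3b - 2x, so x = (4b - a)/3, and both functions should reproduce that value.

```python
from fractions import Fraction

def equate_values(eq1, eq2):
    """'Equality of values': solve both equations for y and set the results equal.

    Each equation is (p, q, r) meaning p*x + q*y = r; q must be nonzero in both.
    """
    p1, q1, r1 = map(Fraction, eq1)
    p2, q2, r2 = map(Fraction, eq2)
    # y = (r1 - p1*x)/q1 and y = (r2 - p2*x)/q2; equality leaves x by itself.
    x = (r1 / q1 - r2 / q2) / (p1 / q1 - p2 / q2)
    y = (r1 - p1 * x) / q1
    return x, y

def substitute_value(eq1, eq2):
    """'Substituting its value': solve the first equation for y, insert it in the second."""
    p1, q1, r1 = map(Fraction, eq1)
    p2, q2, r2 = map(Fraction, eq2)
    # p2*x + q2*(r1 - p1*x)/q1 = r2 contains x alone.
    x = (r2 - q2 * r1 / q1) / (p2 - q2 * p1 / q1)
    y = (r1 - p1 * x) / q1
    return x, y

# Newton's illustration, a + x = b + y and 2x + y = 3b, with hypothetical a = 1, b = 4.
a, b = 1, 4
first = (1, -1, b - a)    # x - y = b - a
second = (2, 1, 3 * b)    # 2x + y = 3b
print(equate_values(first, second))     # (Fraction(5, 1), Fraction(2, 1))
print(substitute_value(first, second))  # (Fraction(5, 1), Fraction(2, 1))
```

Both tactics give x = 5 = (4b - a)/3 and y = 2 for these values. A singular pair of equations would raise ZeroDivisionError in either function.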

In a study of algebra education in the 16th through the 18th centuries, Macomber [131, p. 132] finds
Newton's general rule for solving simultaneous equations was "the earliest appearance of this method on
record." Moreover, (op. cit., pp. 143-144) "before the death of Newton there came to be a demand for suitable
text books of algebra for the public schools; and during the 18th century, a number of texts appeared, all more
closely resembling the algebra of Newton than those of earlier writers." Among the authors Macomber finds
Newton influenced was the banker Nathaniel Hammond. He served as chief accountant for the Bank of
England from 1760 to 1768 [154], and his successful algebra textbook went to four editions between 1742
and 1772. Hammond's interesting introduction summarized algebraic history from ancient times to his own, 
as he understood it, but his lessons got down to business by emphasizing clear instructions for solving word 
problems, which were dreaded even then. 

As the principal Difficulty in this Science, is acquiring the Knowledge of solving of Questions,
I have given a great Variety of these respect to Numbers and Geometry, and their solutions
I chose to give in the most particular, distinct, and plain Manner; and for which the Reader will 
find full and explicit Directions. 

— Hammond [91, p. vii]

Thus Hammond [91, pp. 142, 219-220, 296-297] divided Newton's rule for simultaneous equations into a 
progression of rules for two, three, and four equations. 



4 Whiteside [182, v. 2, p. 401, n. 63] notes that Newton originally wrote "elimino" and then replaced it in some instances by
"extermino." Whiteside's translation may be preferred to the text in Newton [142].
The Method of resolving Questions, which contain four Equations, and four unknown
Quantities.

72. When the Question contains four Equations, and there are four unknown Quantities in each 
Equation; find the Value of one of the unknown Quantities in one of the given Equations, and for 
that unknown Quantity in the other three Equations write the Value of it, which then reduces the 
Question to three Equations, and three unknown Quantities. 

Then find the Value of one of these three unknown Quantities in one of these three Equations, 
and for that unknown Quantity in the other two Equations write the Value of it, which reduces 
the Question to two Equations, and two unknown Quantities. 

Then find the Value of one of the unknown Quantities in each of these two Equations, and 
make these Equations equal to one another, when we shall have an equation with only one
unknown Quantity, which being reduced, will answer the Question. . . .

And in the same Manner may any other Question in the like Circumstances be answered. — 
Hammond [91, pp. 296-298]

Hammond [91, p. 142] echoed Newton's terminology of "exterminating an unknown Quantity," but unlike
Newton he illustrated the progressive cases with many systems of linear equations. His method for removing 
variables from two equations was Newton's "equating values" whereas from three or more equations it was 
Newton's substitution. In marked contrast to both Newton and Hammond, a contemporary work by the 
sightless Lucasian professor, Saunderson [156], excerpted in Saunderson [157, pp. 164+], solved the
three-equation problem of Cardano-Pelletier without stating a general rule for simultaneous equations.

Euler [58] also wrote an algebra textbook that was much admired for its concise style. By then totally
blind himself, Euler begins with the compelling testimonial that the book was dictated for the instruction of
his sight secretary, who mastered the subject from the text without additional instruction.⁵ Euler included a
chapter specifically for simultaneous linear equations. To find the values for two unknowns in two equations, 
Euler repeated the "equating values" method: 

The most natural method of proceeding ... is, to determine, from both equations, the value
of one of the unknown quantities, as for example x, and to consider the equality of these two
values; for then we shall have an equation, in which the unknown quantity y will be found by
itself ... Then, knowing y, we shall only have to substitute its value in one of the quantities that
express x.

— Euler [58, part 2, sec. 1, chap. 4, sec. 45] translated in [59, p. 206]

Euler continued with "equating values" for three equations. However, he cautioned against adopting a rote 
approach, and therefore did not state a general algorithm. 

If there were more than three unknown quantities to determine, and as many equations to 
resolve, we should proceed in the same manner; but the calculation would often prove very 
tedious. It is proper, therefore, to remark, that, in each particular case, means may always be 
discovered of greatly facilitating the solution. 

— Euler [58, part 2, sec. 1, chap. 4, sec. 53] translated in [59, p. 211]

Lacroix [120] wrote another in this growing series of algebra textbooks. Remembered as a minor
mathematician today, he was a member of the reconstituted French Academy, and was recognized as an astute
author. His masterpiece, Traité du calcul différentiel et du calcul intégral, remained in print as the standard
reference for 18th century calculus even after Cauchy began to reinvent the foundations. Domingues [47, p. 3]
comments that Lacroix sought to compare different approaches and to present the best in an original, uniform
style that benefits the reader though it may obscure the origin of the material. A contemporary book review
[171] presumed the principal sources for Lacroix's Élémens d'algèbre were Bezout, Euler, and Lagrange. To
this group perhaps should be added Hammond.



5 Heefer [97] traces most of Euler's exercises to an algebra textbook by Christoff Rudolff in 1525 and reprinted by Michael Stifel in
1554. Kloyda [116, p. 132] sheds more light on the source of exercises by remarking, without citation, that Rudolff also published a
problem book that went through several editions.






Lacroix began with lessons specifically for linear equations. These underwent considerable revision
between his 2nd and 5th editions published in 1800 and 1804, respectively. The 2nd edition discussed one and
two unknowns in the same number of equations, and then passed to a derivation of explicit formulas for the
unknowns in systems of two, three, and even four equations [120, pp. 79-104, sec. 75-91]. Between these
two rather different lessons, the 5th edition inserted a clear statement of how to solve simultaneous linear
equations. The subscript-free notation of the time did not allow easy expression of arbitrarily many equations
and unknowns, but in his lesson title Lacroix made it plain the method could be applied to any number.

Of the resolution of any given number of Equations of the First Degree, containing an equal 
number of Unknown Quantities. 

78. ... if these unknown quantities are only of the first degree, [then] according to the method 
adopted in the preceding articles, we take in one of the equations the value of one of the unknown 
quantities, as if all the rest were known, and substitute this value in all the other equations, which 
will then contain only the other unknown quantities. 

This operation, by which we exterminate one of the unknown quantities, is called
elimination.⁶ In this way, if we have three equations with three unknown quantities, we deduce from
them two equations with two unknown quantities, which are to be treated as above; and having 
obtained the values of the two last unknown quantities, we substitute them in the expression for 
the value of the first unknown quantity. 

If we have four equations with four unknown quantities, we deduce from them, in the first 
place, three equations with three unknown quantities, which are to be treated in the manner just 
described; having found the values of the three unknown quantities, we substitute them in the 
expression for the value of the first, and so on. 

— Lacroix [?, p. 114, art. 78, original emphasis]
translated in [?, p. 89, art. 78]

Lacroix followed Hammond in using what Newton called substitution to eliminate variables. 
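
The rule quoted from Lacroix is recursive, which a short Python sketch makes explicit. This is an illustration under assumptions, not Lacroix's own presentation: the function name is hypothetical, each equation is stored as a list of coefficients followed by its right-hand side, and the first unknown is always the one whose value is taken.

```python
from fractions import Fraction

def solve_by_substitution(equations):
    """Solve n linear equations in n unknowns by repeated substitution.

    Each equation is [c_1, ..., c_n, rhs]. Illustrative sketch only.
    """
    eqs = [[Fraction(v) for v in row] for row in equations]
    # Take an equation in which the first unknown actually appears.
    idx = next((i for i, row in enumerate(eqs) if row[0] != 0), None)
    if idx is None:
        raise ValueError("system is singular")
    lead = eqs[idx]
    # Value of the first unknown "as if all the rest were known":
    # x_1 = expr[-1] - expr[0]*x_2 - ... - expr[-2]*x_n
    expr = [v / lead[0] for v in lead[1:]]
    if len(eqs) == 1:
        return [expr[-1]]
    # Substitute this value in all the other equations, which removes x_1.
    reduced = [[r - row[0] * e for r, e in zip(row[1:], expr)]
               for i, row in enumerate(eqs) if i != idx]
    rest = solve_by_substitution(reduced)
    # Substitute the values found back into the expression for x_1.
    x1 = expr[-1] - sum(c * v for c, v in zip(expr[:-1], rest))
    return [x1] + rest

# The three-equation problem of Cardano and Pelletier: unknowns R, A, B.
print(solve_by_substitution([[2, 1, 1, 64], [1, 3, 1, 84], [1, 1, 4, 124]]))
# [Fraction(12, 1), Fraction(16, 1), Fraction(24, 1)]
```

Each level of the recursion deduces from n equations a system of n - 1 equations, as in the quoted passage, and the final substitutions recover the unknowns in reverse order.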

5. "Without any desire to do things which are useful" 

The first-ever use for simultaneous linear equations was the method of least squares, invented just when 
Lacroix wrote his textbook, at the start of the 19th century. It appears to be difficult to show there was any need 
to solve elaborate equations of any kind before then. None of the secondary literature argues that solving 
algebraic equations was needed, and several authors intimate the contrary. Libbrecht [129, p. 416] values 
the information about daily life in ancient Chinese mathematics lessons, but however colorfully those word 
problems may have been composed, it is plain the exercises in chapter 8 of the Nine Chapters are contrived. 
Katz [114] finds a similar artificiality in algebra problems in Babylonian, Greek, Arabic, and European 
Renaissance texts. Neugebauer [139, pp. 71-72] points out that ancient economies required only arithmetic to 
function. Hogendijk [100] explains that Islamic civilization needed arithmetic for commerce and advanced 
mathematics for astronomy, but the mathematical studies that in hindsight were the most sophisticated, such 
as algebra, were undertaken for their own sake. 

Few memorable discoveries have ever been made for their immediate utility, so there is no reason to 
suppose either our very distant predecessors or the inventors of algebra should have been any different. "By 
and large it is uniformly true in mathematics that there is a time lapse between a mathematical discovery and 
the moment when it is useful," John von Neumann [140] opined in his mannered prose, and in the meantime, 
"the whole system seems to function without any direction, without any reference to usefulness, and without 
any desire to do things which are useful." Thus it is unusual when a discovery is immediately useful, as was 
the method of least squares, which finally created a need for solving simultaneous linear equations. 



6 Lacroix's translator introduced Newton's word "exterminate." 

6. "Méthode des moindres quarrés" 



The genesis of least squares lies in a scientific question that was resolved unsatisfactorily in the 18th century. 
Making accurate predictions from measurements tested preconceptions about the relationship between 
mathematics and the sciences. Differences between astronomical observations and orbital formulas, derived 
from Newton's principles, at times cast doubt even on the inverse square law of gravitation.⁷ Both Euler 
and Laplace speculated the law of gravity might need modification for astronomical distances. Stigler [164, 
pp. 17, 28] hypothesizes that fitting the orbital formulas was inconceivable for lack of conceptual grounds 
to justify amalgamating the errors of separate measurements. A new paradigm was adopted once Tobias 
Mayer successfully applied ad hoc fitting methods to predict the lunar orbit. Laplace in particular then derived 
fitted orbits that both vindicated Newton and established the mathematics Laplace employed, analysis.⁸ 
Farebrother [62] surveys the several fitting methods available at the end of the 18th century. These successes 
raised the question: what was the best fitting method to use? 

Two inventors are recognized for the method of least squares. Gauss claimed to have known "that the sum 
of the squares should be minimized" since 1794 or 1795.⁹ Thus when the dwarf planet Ceres was sighted 
and lost in 1801, he quickly found its orbit by procedures that included least squares methods. Gauss was 
reticent about the calculations in the hasty announcement of the Ceres orbit. An explanation written in 1802 
was sent to a friend, Olbers, and inexplicably returned only in 1805, yet all the while Gauss was publishing 
orbits for Ceres and other celestial bodies [52, pp. 53, 420-21]. 

Meanwhile, in an appendix to a long paper about geodesy, Legendre [127, pp. 72-74] posed the general 
problem of finding the most accurate parameterization furnished by a given set of observations. Stigler 
[164, pp. 13, 15] recommends Legendre's short text as among the most elegant introductions of a significant 
mathematical concept. Legendre noted that the problem often involves many systems of equations each of the 
form 

\[ E = a + bx + cy + fz + \text{etc.}, \]

(his notation) where a, b, c, f, ... are numbers that vary among the equations, and x, y, z, ... are parameters 
common to all the equations. (As in linear models today, the numbers in each equation represent one 
observation; the model parameters are the unknowns.) Legendre viewed the problem as finding values for 
the parameters to make E small. If the number of equations exceeds that of the unknowns (so the equations 
cannot be solved exactly), then he suggested minimizing the sum of the squares of the E's. He called this 
overall process the méthode des moindres quarrés (modern carrés). The solution was found by differentiating 
the sum of squares to derive the "equations of the minimum" (modern normal equations). These simultaneous 
equations were linear and equinumerous with the variables, so, Legendre said, they could be solved by 
"ordinary methods." 

Over a decade passed between Gauss's own discovery and his publication. Although he intended to 
publish immediately after Legendre, the manuscript was delayed by the Napoleonic wars. In Theoria Motus, 
Gauss explained a refined process for orbital calculations, and he returned to the conceptual problem of fifty 
years earlier: how to justify values calculated from erroneous data. Lacking justification, Gauss [71, art. 
186] intimated, the only reason to minimize squares was convenience. Rather than begin by minimizing the 
discrepancy in the equations as Legendre had, Gauss [71, art. 175] formulated "the expectation or 
probability that all these values will result together from observation." Assuming the errors in the unknowns 
follow a Gaussian or normal distribution (later terminology), Gauss showed that maximizing the expectation 
implies the sum of squares should be minimized. Gauss then echoed Legendre's sentiment about the 
resulting calculation to find the parameters. 

We have, therefore, as many linear equations as there are unknown quantities to be determined, 
from which the values of the latter will be obtained by common elimination. (. . . per eliminationem 
vulgarem elicientur.) 

— Gauss [71, art. 180], this translation (1857) 

7 See Gillispie [80, p. 29], Goldstine [82, p. 142], and Stigler [164, p. 30] for elaboration. 
8 Hawkins [95, sect. 2] summarizes the status of analysis at the time of Lagrange and Laplace. 
9 Gauss [71, art. 186] said 1795, but Plackett [151, p. 241] quotes Gauss recalling 1794. Dunnington [52, p. 469] reports Gauss began 
his famous diary of mathematical discoveries only in 1796. 

There followed a scientific dialogue in published work of Gauss and Laplace that explored the nature of 
probability and estimation.¹⁰ Gauss [74] gave a second and unqualified justification for the method of least 
squares that is now the Gauss-Markov theorem for the minimum variance linear unbiased estimator. Hald [90, 
pp. 98, 105-109] suggests few contemporary readers, if any, understood all that Gauss and Laplace wrote. 
Nevertheless, he continues, Gauss's first proof for the method of least squares, coupled with the sufficiency 
in many instances of the assumption of normally distributed errors, allowed the likes of Hagen, Chauvenet, 
and Merriman to maintain and extend a statistically respectable methodology of estimation from the time 
of Gauss through the development of modern statistics. Mayer, Legendre, Laplace, and Gauss, those 
on whose shoulders we stand, each in their way contributed to the last mathematical prerequisite for the 
industrial revolution by reconciling experimental uncertainty with Newton's deterministic physics to create 
the predictive models needed for engineering design. 

At the end of the 19th century, Bartlett [11, p. 1] could proudly announce that "scientific investigations 
of all kinds" relied on a mature computational technology called "The Adjustment of Observations" or "The 
Method of Least Squares."¹¹ The subject divided into two cases. 

case 1. The "adjustment of indirect observations" had Legendre's original, overdetermined equations that 
today would be stated as $\min_x \|b - Ax\|_2$. These problems were solved by reducing them to A'Ax = c 
with c = A'b. See Wright and Hayford [186, chap. 4] and Bartlett [12, secs. 23-32]. 

case 2. Gauss [75] formulated the "adjustment of conditioned observations" to find minimum norm solutions 
of underdetermined equations, $\min_{Ax = b} \|x\|_2$. He reduced these problems to AA'u = b where x = A'u. 
See Wright and Hayford [186, chap. 5] and Bartlett [12, sec. 33]. 

Matrices were not used in these formulations, note, until the mid 20th century. Each row of the 
overdetermined (case 1) or underdetermined (case 2) equations, Ax = b, was called a condition. The reduced 
forms of both problems were called normal equations and were solved by elimination. Stigler [165, pp. 415-420] 
reports that Gauss [73, p. 84] seemingly used this name first but also just once and offhand, so what 
Gauss meant by Normalgleichungen is unknown despite exhaustive scholarship. 
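
The two reductions can be sketched in modern terms as follows. This is a minimal illustration, not a reconstruction of 19th-century practice: the helper names (`solve`, `adjust_indirect`, `adjust_conditioned`) are hypothetical, matrices are plain lists of rows, exact fractions replace hand arithmetic, and the normal equations are assumed to be nonsingular.

```python
from fractions import Fraction

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def solve(M, v):
    """Solve the square system M x = v by elimination and back-substitution."""
    n = len(v)
    aug = [[Fraction(a) for a in row] + [Fraction(b)] for row, b in zip(M, v)]
    for k in range(n):
        p = next(i for i in range(k, n) if aug[i][k] != 0)
        aug[k], aug[p] = aug[p], aug[k]
        for i in range(k + 1, n):
            m = aug[i][k] / aug[k][k]
            aug[i] = [a - m * b for a, b in zip(aug[i], aug[k])]
    x = [Fraction(0)] * n
    for k in reversed(range(n)):
        s = sum(aug[k][j] * x[j] for j in range(k + 1, n))
        x[k] = (aug[k][n] - s) / aug[k][k]
    return x

def adjust_indirect(A, b):
    """Case 1: minimize ||b - A x||_2 by solving A'A x = A'b."""
    At = transpose(A)
    return solve(matmul(At, A), matvec(At, b))

def adjust_conditioned(A, b):
    """Case 2: minimum-norm x with A x = b, from A A' u = b and x = A' u."""
    At = transpose(A)
    u = solve(matmul(A, At), b)
    return matvec(At, u)

# Case 1: three observations, two parameters (a straight-line fit).
line = adjust_indirect([[1, 0], [1, 1], [1, 2]], [1, 2, 4])
# Case 2: two conditions on three corrections.
corrections = adjust_conditioned([[1, 1, 1], [1, -1, 0]], [3, 1])
```

In case 2 the vector `u` solved from the normal equations is only an intermediate quantity; the corrections themselves are recovered afterwards as `A'u`.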

7. Gauss's Method 

Dunnington [52, pp. 227-228] explains that Gauss performed meticulous calculations almost as a leisure 
activity throughout his life. Gauss thought calculations so important that he included them in his papers where 
he strove to make them brief and intuitively clear. Accordingly, his publication that mentioned "common 
elimination" [71] was followed with a detailed explanation [72], but of quadratic forms, not linear equations. 

Quadratic forms appeared in prior work of Lagrange, Legendre, and Gauss [70, art. 222] on number theory 
and also in a numerical context. In his very first paper, Lagrange [125] substituted new variables for linear combinations 
of the original variables in a quadratic form, x'Ax in modern matrix notation, to give the entries of a triangular 
matrix U of substitution coefficients so that A = U'DU for a diagonal matrix D. This formula expressed the 
quadratic form as a weighted sum of squares, (Ux)'D(Ux), which Lagrange used to ascertain local extrema.¹² 
Toeplitz [172, p. 102] and Wedderburn [178, p. 68] remembered Lagrange for this representation of quadratic 
forms almost two hundred years later. Thus Gauss likely knew of the Lagrangian provenance as well. 

When Gauss first considered least squares in 1795, he borrowed from the Göttingen library the volume 
that begins with Lagrange's paper on extrema. Amazingly, the books Gauss borrowed as a student are known 
[52, p. 398]. There is no evidence he read the paper, yet the indication is it influenced his approach to least 
squares. For example, Gauss did not need the paper for his major work on number theory from this period, 
which did cite the journal [70, art. 202, fn. 9] but not the volume borrowed in 1795. Since neither [70] nor 
[125] dealt with forms of more than three variables [94], Gauss still had to systematically extend the treatment 
to arbitrarily many unknowns in the subscriptless notation of the time, which he finally did in the context of 
least squares.¹³ 

10 Three short monographs with historical emphasis but on slightly different aspects of this work (statistical fitting procedures, 
parametric statistical inference, and classical analysis of variance) have recently been written by Farebrother [62], Hald [90], and [37], 
respectively. Histories with a wider scope by Gillispie [80, chap. 25], Goldstine [82, chaps. 4.10], and Stigler 
[164, chap. 4] should also be consulted. 

11 Bartlett [11, pp. 110-111] and (1915, pp. v-vi) lists English, French, German, and Italian textbooks from the end of the 19th century. 
See Ghilani and Wolf [78] for a treatment from the beginning of the 21st century. 

12 Since Lagrange's discovery seems not to have been incorporated in general algebra textbooks, it is tenuous to see, as some web 
pages do, a possible origin for Gaussian elimination in this work. 

Gauss [72] wrote the overdetermined equations as 

\[
\begin{aligned}
n + a\,p + b\,q + c\,r + d\,s + \dots &= w \\
n' + a'p + b'q + c'r + d's + \dots &= w' \\
n'' + a''p + b''q + c''r + d''s + \dots &= w'' \\
&\;\;\text{etc.}
\end{aligned}
\]

(original notation, but "etc." replaced by "..."). The symbols n, a, b, c, ..., with or without primes, are 
numbers. The purpose is to find values for the variables p, q, r, ... to minimize 

\[ Q = ww + w'w' + w''w'' + \dots . \]

Gauss introduced a bracket notation (unnamed by him, later called auxiliaries) 

\[ [xy] = xy + x'y' + x''y'' + \dots , \]

where the letter x either is y or lexicographically precedes y. This notation expressed the normal equations 
(name not yet introduced) as 

\[
\begin{aligned}
{[an]} + [aa]\,p + [ab]\,q + [ac]\,r + [ad]\,s + \dots &= 0 \\
{[bn]} + [ab]\,p + [bb]\,q + [bc]\,r + [bd]\,s + \dots &= 0 \\
{[cn]} + [ac]\,p + [bc]\,q + [cc]\,r + [cd]\,s + \dots &= 0 \\
&\;\;\text{etc.}
\end{aligned}
\]

As Legendre and he had done before, Gauss [72, p. 22] again remarked these equations could be solved 
by elimination, but he did not explicitly perform that calculation. Instead, he noted the brackets give the 
coefficients of the variables in the sum of squares, a quadratic form. 

\[
\begin{aligned}
Q = {} & [nn] + 2[an]\,p + 2[bn]\,q + 2[cn]\,r + 2[dn]\,s + \dots \\
& + [aa]\,pp + 2[ab]\,pq + 2[ac]\,pr + 2[ad]\,ps + \dots \\
& + [bb]\,qq + 2[bc]\,qr + 2[bd]\,qs + \dots \\
& + [cc]\,rr + 2[cd]\,rs + \dots \\
& \text{etc.}
\end{aligned}
\]

Gauss extended his bracket notation to 

\[
\begin{aligned}
{[xy,1]} &= [xy] - \frac{[ax][ay]}{[aa]} \\
{[xy,2]} &= [xy,1] - \frac{[bx,1][by,1]}{[bb,1]} \\
{[xy,3]} &= [xy,2] - \frac{[cx,2][cy,2]}{[cc,2]} \\
&\;\;\text{etc.}
\end{aligned}
\qquad (8)
\]

and so on. These values are the coefficients remaining in Q after successive variables have been grouped into 
perfect squares. The first of these combinations of variables, A, 

\[
\begin{aligned}
A &= [an] + [aa]\,p + [ab]\,q + [ac]\,r + [ad]\,s + \dots \\
B &= [bn,1] + [bb,1]\,q + [bc,1]\,r + [bd,1]\,s + \dots \\
C &= [cn,2] + [cc,2]\,r + [cd,2]\,s + \dots \\
&\;\;\text{etc.,}
\end{aligned}
\qquad (9)
\]

simplifies the quadratic form by removing the variable p: 

\[
\begin{aligned}
Q - \frac{A^2}{[aa]} = {} & [nn,1] + 2[bn,1]\,q + 2[cn,1]\,r + 2[dn,1]\,s + \dots \\
& + [bb,1]\,qq + 2[bc,1]\,qr + 2[bd,1]\,qs + \dots \\
& + [cc,1]\,rr + 2[cd,1]\,rs + \dots \\
& \text{etc.}
\end{aligned}
\]

If this process is repeated with B, C, ..., then eventually, 

\[ Q - \frac{A^2}{[aa]} - \frac{B^2}{[bb,1]} - \frac{C^2}{[cc,2]} - \dots = [nn,\mu], \]

where μ is the quantity of variables. Each of A, B, C, ... has one fewer unknown than the preceding. Thus 
A = 0, B = 0, C = 0, ... can be solved in reverse order to obtain the values for p, q, r, ..., also in reverse 
order, at which Q attains its minimum, [nn,μ]. In later theoretical discussions of least squares methods, such 
as Gauss [75, art. 13], he always referred back to his 1811 paper for details of the calculations as transforming 
quadratic forms. 

13 Farebrother [62, p. 161n] attributes double subscript notation to Cauchy in 1815, first used by Gauss in 1828. 
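
The bracket recursion of equation (8) and the reverse-order solution of A = 0, B = 0, C = 0, ... can be traced with a short Python sketch. It is an illustration under assumptions: the function names are hypothetical, letters a, b, c, ... are replaced by column indices 0, 1, 2, ... with the constants n stored as the last column, exact fractions are used, and no auxiliary of the form [bb,1], [cc,2], ... is assumed to vanish.

```python
from fractions import Fraction

def gauss_brackets(columns, n_col):
    """Return levels[k][(i, j)] = [xy,k] for letters x = i, y = j with i <= j.

    `columns` holds the coefficient columns a, b, c, ... of the observation
    equations and `n_col` the constants n; index len(columns) stands for n.
    """
    cols = [[Fraction(v) for v in col] for col in columns]
    cols.append([Fraction(v) for v in n_col])
    m = len(columns)
    # Level 0: [xy] = xy + x'y' + x''y'' + ...
    levels = [{(i, j): sum(x * y for x, y in zip(cols[i], cols[j]))
               for i in range(m + 1) for j in range(i, m + 1)}]
    for k in range(m):
        prev = levels[-1]
        # Equation (8): subtract the product of the two brackets led by the
        # k-th letter, divided by that letter's own diagonal bracket.
        levels.append({(i, j): prev[(i, j)] - prev[(k, i)] * prev[(k, j)] / prev[(k, k)]
                       for i in range(k + 1, m + 1) for j in range(i, m + 1)})
    return levels

def solve_in_reverse(levels):
    """Solve A = 0, B = 0, C = 0, ... starting from the last unknown."""
    m = len(levels) - 1
    x = [Fraction(0)] * m
    for k in reversed(range(m)):
        lvl = levels[k]
        s = lvl[(k, m)] + sum(lvl[(k, j)] * x[j] for j in range(k + 1, m))
        x[k] = -s / lvl[(k, k)]
    return x

# Observation equations w = n + a p + b q for three observations.
levels = gauss_brackets([[1, 1, 1], [0, 1, 2]], [-1, -2, -4])
p, q = solve_in_reverse(levels)
minimum_Q = levels[2][(2, 2)]  # the final auxiliary [nn,2]
```

For this small example the residuals w at the computed p and q are -1/6, 1/3, -1/6, whose squares sum to the final auxiliary [nn,2], so the last bracket delivers the minimum of Q without any further work.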

Gauss's contributions to the method of least squares were known immediately. By 1819 even a gymnasium 
prospectus, Paucker [148], cited Legendre [127] and Gauss [71], but it did not cite Gauss for any particular 
solution method. Gauss's solution process, which neatly tied together linear algebra, optimization theory, and 
his probabilistic justification for the method of least squares, seems to have been adopted as an algorithm 
slowly, and by geodesists rather than by mathematicians. 



8. Geodesy for Cartography 

The original motivation for the computational developments in least squares was their use in two major 
scientific activities of the 19th century. One was astronomy, which was pursued for navigation and timekeeping 
besides its intrinsic interest. Gauss [71] described the methods he had invented to derive orbital formulas from 
a few observations. He applied them before and after 1809 in many papers that constitute the bulk of his early 
work. Nievergelt [143] explains the orbital calculations used in 19th century astronomy, while Grier [89] 
explains the institutional history of computing groups in national observatories where the calculations were 
made. 

Another scientific activity continually and directly sponsored by governments was geodetic research for 
cartography. Indeed, the first scientific agency of the United States was the Coast Survey Office founded in 
1807, see Cajori [28]. Cartographers positioned major towns and landmarks, relative to one another, by using 
them as vertices in networks of triangles. Gauss became prominent in geodesy through the many papers he 
wrote during his protracted survey of Hanover. This small German kingdom, roughly coincident with modern 
Lower Saxony, was the ancestral possession of the British royal family, and was Gauss's home for most of his 
life. Dunnington [52, chap. 10] relates that Bessel warned Gauss the survey toil detracted from his research. 
Nevertheless, although geodesy had inspired Legendre to invent the method of least squares, a survey officer 
explains it was from Gauss that geodesists adopted the method. 

If in effecting a [survey] triangulation one observed only just so many angles as were absolutely 
necessary to fix all the points, there would be no difficulty in calculating [the locations]; only 
one result would be arrived at. But it is the invariable custom to observe more angles than are 
absolutely needed, and it is these supernumerary angles which give rise to complex calculations. 
Until the time of Gauss and Bessel computers had simply used their judgement as they best could 
as to how to employ and utilize the supernumerary angles; the principal of least squares showed 
that a system of corrections ought to be applied, one to each observed bearing or angle, such 
that subject to the condition of harmonizing the whole work, the sum of their squares should be 
an absolute minimum. The first grand development of this principle is contained in this work of 
Bessel's. — Clarke [36 pp. 26-27] 

Clarke refers to a Prussian triangulation that is still important in European cartography, wherein Bessel and 
Baeyer [18, p. 130] used the methods of Gauss [75]. This endorsement in a survey for a major government 
drew the attention of other geodesists such as Clarke. 

As Clarke explained, a critical step in making maps was to reconcile the slightly inconsistent angle 
measurements that were the raw data gathered by surveyors. Gauss [75, art. 22] stipulated the true angles satisfy 
three kinds of conditions: (1) the sum of angles around an interior vertex equals 2π, (2) the sum of angles in 
a triangle equals π plus a spherical correction, and (3) side conditions that chain together linearized sine laws 
for circuits of triangles with common edges.¹⁴ The measured angles nearly satisfy these conditions, so the 
unknowns are perturbations intended to make the measurements true. Gauss formed a side condition from 
the triangles around each interior vertex, in which case (it is easily seen) even an ideal net consisting of f 
nonoverlapping triangles and v vertices has 3f angles (if all are measured), but only 3f - 2v + 4 conditions. 
Thus, in general, the adjustment problems were under- not overdetermined. In his last major theoretical work 
on least squares, Gauss [75] introduced the solution method stated in case 2. He illustrated the method by 
readjusting a small part of the Dutch triangulation, see Figures 3 and 4. In comparison the British Isles 
triangulation was quite irregular [145, plate xviii]. Most surveys also had missing measurements from inaccessible 
stations (e.g. mountaintops) or blocked sight lines. 

As the 19th century progressed, the growing use of least squares methods created a recurring need to solve 
dauntingly large problems. For example, the British Isles survey had 1554 angles subject to 920 conditions.¹⁵ 
Of necessity this large problem was broken into smaller subproblems. 

In the principal triangulation of Great Britain and Ireland there are 218 stations, at sixteen of 
which there are no observations, the number of observed bearings is 1554, and the number of 
equations of condition 920. The reduction of so large a number of observations in the manner 
we have been describing [i.e. least squares case 2] would have been quite impossible, and it was 
necessary to have recourse to methods of approximation 

. . . the network covering the kingdom was divided into a number of blocks, each presenting a 
not unmanageable number of equations of condition. One of these being corrected or computed 
independently of the others, the corrections so obtained were substituted (as far as they entered) 
in the equations of condition of the next block, and the sum of the squares of the remaining 
equations in that figure made a minimum. The corrections thus obtained for the second block 
were substituted in the third and so on. Four of the blocks are independent commencements, 
have no corrections from adjacent figures carried into them. The number of blocks is 21: in 9 of 
them the number of equations of condition is not less than 50: and in one case the number is 77. 
These calculations — all in duplicate — were completed in two years and a half — an average 
of eight computers being employed 

In connection with so great a work successfully accomplished, it is but right to remark how 
much it was facilitated by the energy and talents of the chief computer, Mr. James O'Farrell. — 
Clarke [36, pp. 237,243] 

Additional local color can be found in Palmer [147] and White [180]. In these situations, professional 
computers found it useful to emulate Gauss's calculations of fifty years earlier. 



9. Elimination After Gauss 

"Numerical mathematics stayed as Gauss left it until World War II," concluded Goldstine [82, p. 287] 
from his history of the subject. In those 120 years after Gauss [75] there were at best a dozen noteworthy 
publications about solving simultaneous linear equations. The topics were algorithmic simplifications for 
professional computers, and matrix interpretations. These developments overlap chronologically, but they 
did not influence each other until the middle of the 20th century, so they can be addressed separately in 
sections 10 and 11 respectively, with minimal cross-reference. 



14 Clarke [36] notes the conditions were formulated for the sphere using "Gauss's Theorems" and "Legendre's Theorem" from spherical 
trigonometry. The Ordnance Survey [145], the research paper by [138], the encyclopedia by Jordan [112], and the textbook by Wright and Hayford [186] 
have more explanations and examples. 

15 Stigler [164, p. 158] switches the numbers, a Freudian slip no doubt, indicating the difficulty modern mathematicians have in 
comprehending that, originally, the important least squares problems (those governments would create bureaucracies to solve) were 
underdetermined not overdetermined. 

Figure 3: A triangulation of Holland that Gauss [75, p. 86, art. 23] used to illustrate survey adjustments. He took the data from a French-language 
publication by de Krayenhof that seems to be unavailable. The same data and the map seen here are in the (likely equivalent) 
Dutch-language publication of Krayenhoff [118]. Courtesy of the Bancroft Library, University of California, Berkeley. 

[Figure 4 legend. Conditions: 1. vertex, 2. triangle, 3. side.] 

Figure 4: The portion of Figure 3 that Gauss readjusted. He had to find adjustments for 27 angles, but he had only 13 conditions for 
them to satisfy, consisting of 2 vertex conditions, 9 triangle conditions, and 2 side conditions. Jordan [112, p. 489] reproduces a similar 
figure. 

The authors discussed under hand computing are Gauss, Doolittle, Cholesky, and Crout. Those under 
matrix theory are Toeplitz, Banachiewicz, Frazer, Dwyer, Jensen, and von Neumann and Goldstine. Since 
a picture is desired of how Gaussian elimination was practiced and developed after Gauss, the categories to 
which authors have been assigned are less important than the decision to highlight a particular work. The 
authors in the first group have been chosen because they demonstrably influenced professional computing 
practice. Those in the second group independently interpreted Gaussian elimination in terms of matrix algebra. 

10. Perfecting Elimination for Professional Computers 

10.1. Gauss's Convenient Notation 

The triangulation of the great Ordnance Survey [145] and the conditions of its adjustment are preserved in 
detail, but how the calculations were performed is missing. Since, apparently, the details of major calculations 
were not archived, whether and why Gauss's solution method was used, and what it was considered to be, 
must be inferred from sources such as Chauvenet [31]. There appear to have been three advantages to Gauss's 
own method of solving the normal equations, which has been described here in section 7. First, the bracket 
notation conceptually separated the equations from the arithmetic so the workflow could be addressed. 

By whatever method of elimination is performed, we shall necessarily arrive at the same final val- 
ues of the unknown quantities; but when the number of equations is considerable, the method of 
substitution, with Gauss's convenient notation, is universally followed. — Chauvenet [31, p. 514] 

It may come as some surprise to learn that when Gauss performed Gaussian elimination, he simply listed 
all the numbers in the order he computed them, using his brackets to identify the values: [cd] = 1.13382, 
[cd,1] = 1.09773, [cd,2] = 1.11063, etc. See Figure 5. Gauss surely took an informed approach to computing 
because he was no academic dilettante. Dunnington [52, p. 138] echoed Bessel in regretting how much 
time Gauss spent calculating for his interminable survey projects: Gauss himself estimated that he had needed 
a million numbers! Second, in contrast to the method of Hammond, Lacroix, and the textbook algorithm 
presented earlier, Gauss realized economies by avoiding duplicate calculations for symmetric equations.¹⁶ 

By means of a peculiar notation proposed by Gauss, the elimination by substitution is carried on 
so as to preserve throughout the symmetry which exists in the normal equations. — Chauvenet 
[31, p. 530] 

Chauvenet even counted the brackets or auxiliaries: 156 for 8 normal equations, etc. Third, Gauss included 
refinements such as estimates of precision, variances, and weights for the unknowns, all expressed in his 
bracket notation. The difficulty of making changes to the formulas, whose complexity was compounded by 
their relationship to Gauss's statistical theories, and the demonstrable benefits of Gauss's ideas, entailed a 
reluctance to alter his computational prescriptions. His efficient method for underdetermined problems by 
itself was conceptually difficult because the solution of the normal equations, AA'u = b, was not the solution 
of the problem, x = A'u. Gauss's bracket notation was still being taught a hundred years after his 1811 paper, 
by Bartlett [12], Johnson [111], and Wright and Hayford [186], and for continued emphasis on the advantages of preserving 
symmetry, see Palmer [146, pp. 84-85]. At the beginning of the 20th century, Wright and Hayford found 
professional computers using either of just two methods to solve normal equations: the brackets of Gauss or 
the tables of Doolittle. 
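
The saving from symmetry can be made concrete by counting the auxiliaries. The sketch below rests on an assumed counting convention, not on Chauvenet's own description: at each stage with m unknowns remaining, one bracket is counted for every coefficient on or above the diagonal plus one for each constant term, and the [nn] family is left out. The function name is hypothetical.

```python
def count_auxiliaries(num_equations, symmetric=True):
    """Count the bracket values formed while eliminating all the unknowns."""
    total = 0
    for remaining in range(num_equations, 0, -1):
        if symmetric:
            # Only coefficients on or above the diagonal are computed.
            coefficients = remaining * (remaining + 1) // 2
        else:
            # Without symmetry every coefficient is computed separately.
            coefficients = remaining * remaining
        total += coefficients + remaining  # plus one constant term per equation
    return total

with_symmetry = count_auxiliaries(8)
without_symmetry = count_auxiliaries(8, symmetric=False)
```

Under this convention 8 normal equations give 156 auxiliaries, matching the figure quoted from Chauvenet, against 240 if symmetry is ignored.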

10.2. Doolittle, Legendary Computer 

Myrick Hascall Doolittle was a computer who solved Gauss's normal equations to adjust triangulations at 
the United States Coast and Geodetic Survey for 38 years from 1873 to 1911. See Figure 6. A biographical 
sketch has been written by Grier [89, pp. 78-79], and some primary biographical sources are quoted at length 
by Farebrother [60]. Doolittle was the only human computer remembered for his professional acumen. 



16 The normal equations are symmetric in the sense that the same coefficient applies to the j-th variable in the k-th equation, and vice 
versa. 

Figure 5: How Gauss performed Gaussian elimination. He wrote down the numbers in the order he computed them, using his bracket 
notation to identify the values. Excerpted from Gauss [72, p. 25]. 



Demonstrating the efficacy of his methods, Doolittle [49, p. 117] notes that he solved 41 normal equations 
in a week using paper and pencils. For perspective, Fox [66, p. 676] reports that four mathematicians, Alan 
Turing among them, needed two weeks to solve 18 equations using electric desk calculators in 1946. Nothing 
suggests Doolittle was a numerical savant because a colleague, Mr. J. G. Porter, duplicated the calculation in 
the same time to check for errors. 

The speed stemmed from streamlining the work for hand computing. Among the practices Doolittle [49] 
described is identifying the numbers of the calculation by their placement in tables. He presented a small 
numerical example that is restated here in symbolic bracket notation to clarify the calculation. Corresponding 
to Gauss's equation (7), Doolittle's normal equations were as follows. 

\[
\begin{aligned}
0 &= [aa]\,w + [ab]\,x + [ac]\,y + [ad]\,z + [an] \\
0 &= [ab]\,w + [bb]\,x + [bc]\,y + [bd]\,z + [bn] \\
0 &= [ac]\,w + [bc]\,x + [cc]\,y + [cd]\,z + [cn] \\
0 &= [ad]\,w + [bd]\,x + [cd]\,y + [dd]\,z + [dn]
\end{aligned}
\]

Doolittle expressly arranged the calculation to derive Newton's (back-) substitution formulas for each variable, 
which correspond to rearrangements of Gauss's equations A = 0, B = 0, .... He called these formulas 
the "explicit functions" for the variables. Doolittle was able to co-locate many of the numbers by completing 
the formula for a given variable before undertaking any calculations for the next. He kept the coefficients of 
the formulas in table A, see Figure 7, while he used a second table, B, to record in columns the sums that give 
the values in table A. Exactly how Doolittle conducted the work may be lost. "For the sake of perspicuity," 
he noted, "I have here made some slight departures from actual practice." 

Figure 6: Myrick Hascall Doolittle, 1830-1911, circa 1862. Courtesy of the Antiochiana collection at the Olive Kettering Library of 
Antioch University. 

The salient feature of Doolittle's tables is a reduction in the labor of division and multiplication. All 
divisions reduce to multiplications through reciprocals formed just one per variable in the first column of 
table A. All multiplications have a single multiplier repeatedly applied to several multiplicands in a row of 
table A and recorded in another row of table B. For example, row 8 in table B results from a single multiplier 
in table A, row 2 column y, applied to several multiplicands in row 1, beginning at the same column, y, and 
moving rightward. The reduction of work occurs because Doolittle [49, p. 117] performed multiplication 
using the 3-digit tables of Crelle [40]. Since Doolittle reused the multiplier for an entire row of calculations, 
he could open Crelle's book just once to the table for that multiplier, where all the multiplicands could be 
found without turning pages. Schott [160, p. 93] emphasized that using multiplication tables was innovative: 
"logarithms are altogether dispensed with."¹⁷ 

Doolittle performed the back-substitution with similar economy. He distinguished between numbers used 
once or many times, so he copied the reciprocals and "explicit function" coefficients from table A to table C; 
see Figure 8. The value of z was available in the final row of table A. The remaining variables were evaluated 
in table D where each row consists of one multiplier applied to one column, this time, of table C. The sums 
of the columns in table D give the other variables. 

Doolittle's method included several contributions which are not now recognized. He owed his speed in
part to using just 3-digit arithmetic in the multiplication tables, and hence everywhere in the calculation, but
the rounding errors so introduced would be considered severe. For comparison, a modern computer adhering
to international standards for binary arithmetic has the decimal equivalent of roughly 7 or 16 digits [106].
Thus, an important aspect of Doolittle's method was the ability to correct the 3-digit approximate solution
with comparatively little more work. Since the angle adjustment problem itself corrected numbers that were
approximately known (the measured angles) it may have seemed natural to further correct the adjustments.
Both Doolittle [49] and Schott [160, p. 93] described the correction process without giving it a name;
today it is called iterative refinement.

17 This comment confirms that computers generally did use logarithms to evaluate Gauss's brackets. Note tables of "Gauss's
logarithms" are available for the subtraction.

Figure 7: How Doolittle performed Gaussian elimination, transcribed from the numerical example in Doolittle [49] using Gauss's bracket
notation to identify the quantities. The step numbers indicate the order of forming the rows. Each row in table B is a multiple, by a single
number, of a partial row in table A. The rows 5, 10, and 16 in table A are sums of rows 3-4, 7-9, and 12-15 in table B, respectively.

In describing the refinement process, Doolittle wrote $w_1, x_1, y_1, z_1$ for the values obtained from tables
A-D. When these approximations are substituted into the normal equations they give residual values.

\[
\begin{aligned}
r_1 &= [aa]w_1 + [ab]x_1 + [ac]y_1 + [ad]z_1 + [an] \\
r_2 &= [ab]w_1 + [bb]x_1 + [bc]y_1 + [bd]z_1 + [bn] \\
r_3 &= [ac]w_1 + [bc]x_1 + [cc]y_1 + [cd]z_1 + [cn] \\
r_4 &= [ad]w_1 + [bd]x_1 + [cd]y_1 + [dd]z_1 + [dn]
\end{aligned}
\]

Doolittle emphasized that this particular calculation needed high accuracy, after which the residuals
$r_1, r_2, r_3, r_4$ (notation of this paper) could be rounded to 3 digits for the remaining steps. In table E
of Figure 9 he performs the same calculations on the residuals that he performed on the constant terms of
the normal equations in the final column of table B. Table F of this figure then duplicates the
back-substitution of table D, but now for the corrections, which Doolittle named $w_2, x_2, y_2, z_2$. The
fully-corrected solutions, $w = w_1 + w_2$, $x = x_1 + x_2$,
etc., were accurate to 2 to 3 digits in Doolittle's example.
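
The correction scheme can be sketched in modern terms. The Python below is only an illustration under stated assumptions, not a transcription of Doolittle's tables: the function names (`round_sig`, `solve_low_precision`, `residual`, `refine`) are hypothetical, ordinary elimination with every result rounded to 3 significant digits stands in for tables A-D, and the test system is borrowed from Figure 14 rather than from Doolittle's example. Following the convention of the normal equations, the system is written A x + c = 0.

```python
def round_sig(x, digits=3):
    """Round x to the given number of significant decimal digits."""
    if x == 0:
        return 0.0
    return float(f"{x:.{digits - 1}e}")


def solve_low_precision(A, b, digits=3):
    """Solve A x = b by elimination, rounding every result to `digits` digits."""
    n = len(A)
    M = [[round_sig(v, digits) for v in row] + [round_sig(rhs, digits)]
         for row, rhs in zip(A, b)]
    for k in range(n):
        for i in range(k + 1, n):
            m = round_sig(M[i][k] / M[k][k], digits)
            for j in range(k, n + 1):
                M[i][j] = round_sig(M[i][j] - round_sig(m * M[k][j], digits), digits)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n]
        for j in range(i + 1, n):
            s = round_sig(s - round_sig(M[i][j] * x[j], digits), digits)
        x[i] = round_sig(s / M[i][i], digits)
    return x


def residual(A, c, x):
    """r = A x + c in full double precision (the one step needing high accuracy)."""
    return [sum(a * xj for a, xj in zip(row, x)) + ci for row, ci in zip(A, c)]


def refine(A, c, digits=3):
    """Return the low-precision solution and the solution after one correction."""
    x1 = solve_low_precision(A, [-ci for ci in c], digits)
    r = residual(A, c, x1)                      # rounded to 3 digits inside the solver
    x2 = solve_low_precision(A, [-ri for ri in r], digits)
    return x1, [u + v for u, v in zip(x1, x2)]


# Assumed test system (coefficients of Figure 14), written as A x + c = 0.
A = [[1.0, 0.4, 0.5, 0.6],
     [0.4, 1.0, 0.3, 0.4],
     [0.5, 0.3, 1.0, 0.2],
     [0.6, 0.4, 0.2, 1.0]]
c = [-0.2, -0.4, -0.6, -0.8]
x_first, x_refined = refine(A, c)
print("3-digit solution:    ", x_first)
print("after one correction:", x_refined)
```

The design point is the one Doolittle stressed: only the residual is evaluated accurately, while both solves reuse the same cheap 3-digit arithmetic.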

Another innovation, "one of the principal advantages," was a provision to include new equations and
variables. Wright and Hayford [186, pp. 117-118] provide a clearer text than Doolittle [49], whose description
is somewhat brief. Doolittle suggests large problems could be solved by successively including conditions
in the minimization problem, that is, by appending equations and variables to the normal equations. He
recommends ordering the conditions so as to preserve zeroes in the elimination, and suggests an ordering
based on the geometric interpretation of the conditions. Doolittle thus anticipates the work on sparse matrix
factorizations that would be done a hundred years later, see George and Liu [77] and Duff et al. [50].

Dwyer [53, p. 112] remarked that "from Doolittle down to the present" no formal proof was offered that
Doolittle's tables do solve the normal equations. Some justification is needed because Doolittle does not
explicitly reduce to zero the coefficients of eliminated variables. For example, rows 8 and 9 of table B
remove w and x from the 3rd equation, but Doolittle operates only on coefficients for the retained variables y
and z. This saving is possible because, thanks to the underlying symmetry, Doolittle knows the multipliers
from the omitted calculations are the coefficients of his "explicit functions." Dwyer [54] gave one proof,
and other explanations could be given, such as expanding Gauss's bracket formulas for the coefficients in his
equations A = 0, B = 0, .... The relationship between Gauss's brackets and symmetric elimination seems to
have been generally known, as evidenced by Chauvenet [31, p. 530]. Doolittle had the training and ability to
develop such methods. He taught mathematics at Antioch College after receiving a Bachelor's degree there,
he also studied under Benjamin Peirce at Harvard College, and he chaired the mathematics division of the
Philosophical Society of Washington.

The reason for Doolittle's reticence may be that publishing computing methods per se was neither
appropriate (as judged by the mathematical community) nor desirable (from the standpoint of the computer). The
omission of calculating methods from the report of the Ordnance Survey [145] has been noted, as well as the
neglect of Gauss's computing methods in the histories of mathematics. Grier [89, p. 156] concluded from
his study of hand computers that until the 20th century computing was a craft skill passed from masters to
apprentices. A reluctance to disclose methods is consistent with Grier's picture of journeymen computers.
Indeed, Schott [160, p. 93] emphasized Doolittle's paper was written at the express urging of the Coast Survey
Director. The Washington Star [176] reported Doolittle "contributed numerous papers on his favourite
subject" to the Philosophical Society of Washington, yet calculating was discussed only in this one of Doolittle's
three archived publications listed in the bibliography compiled by Gore [84, p. 365].

Figure 8: How Doolittle performed back-substitution. Table C he copied from table A. The sum of each column in table D evaluates the
"explicit function" for a different variable.

Figure 9: How Doolittle performed iterative refinement. Table E corresponds to the final column of table B. The calculations in table B
apply to the constant terms, while in table E they apply to the residuals. Table F is similar in construction to table D.

Doolittle's paper made a strong impression on computers. The claims made for the "Coast Survey
method" by Doolittle [49] and Schott [160] caused Werner [179] to examine it immediately and skeptically.
He calculated Doolittle's tables three different ways: with Crelle's multiplication tables, with logarithms,
and with the Thomas Arithmometer, which is discussed below. Thus iterative refinement was misunderstood
and neglected by later authors. The Doolittle tables evidently passed muster because they were reprinted
for many years. Jordan [112, pp. iv, 65] remarked on the "old, classic" notation of Gauss and then exhibited
without attribution a calculation using logarithms and Doolittle's table B.^18 Wright and Hayford [186] offered
the Gauss brackets and the Doolittle tables as competing approaches. They did not understand the subtleties
because they touted the use of Crelle's 3-digit tables (p. 120) yet they omitted the iterative refinement. Grier
[89, pp. 159-164] relates that Howard Tolley, a computer at the Coast and Geodetic Office who became an
official at the United States Department of Agriculture, politely scolded economists and statisticians when
they neglected to credit the computing methods developed in geodesy. Tolley and Ezekiel [173] cited several
textbooks teaching Doolittle's method. They illustrated what were still essentially Doolittle's tables by a
calculation that, by then, was done with a Monroe four-function electric calculator, which obviated any need
for iterative refinement. Tolley and Ezekiel report that Miss Helen Lee, a computer, could solve 5 equations
with 6-digit numbers (two whole and four fractional) in 50 minutes in 1927 (or 40 minutes if rounded to two
fractional digits).

10.3. Mechanical Calculators 

By the time Doolittle wrote his paper, Gottfried Wilhelm Leibnitz's stepped drums had been used to
build several arithmetic machines. The last machine, and the first one to become commercially available,
was the Arithmometer of Charles Xavier Thomas de Colmar. It was a four function calculator that operated
by repeated addition to, or subtraction from, a Pascal adder that served as an accumulator. The drums were
elongated gears with 9 cogs of different lengths on their surfaces. An axle parallel to each drum held a
sliding gear positionable to mesh with from 0 to 9 cogs. When the drums made full rotations (powered by
a hand crank) then each axle made a partial rotation (dependent on the position of its sliding gear). In this
way each axle, through additional gearing, added a number from 0 through 9 to a corresponding digit of the
Pascal adder. The accumulator could shift to permit scaling by powers of 10. For multiplication, the number
represented by the positionable gears became the multiplicand. Repeated rotations of the drums and shifts
of the accumulator gradually added the complete product to the accumulator. A sum of products could be
accumulated in this way. Arithmometers were less reliable than could be hoped, yet [111] describes many
uses for them in Victorian England after about 1870.

[167] reports that mass production of mechanical adders and calculators began at the end of the 19th
century. The calculators were three-register machines based on Wilgot Odhner's pinwheel design. The
principle of repeated addition or subtraction remained the same, while more compact and reliable devices 
replaced the Leibnitz drums and the Pascal adder. The input register represented the digits of a number 
by coaxial wheels with variable quantities of pins protruding at their circumferences. A revolution of the 
pinwheels added the number to an output accumulator consisting of sprocket wheels that operated like a 
Pascal adder, but now rotating on a common axle, and again mounted on a moveable carriage. The multiplier 
was an output register that counted the revolutions at each carriage position to check how many additions 
the operator had performed at each power of ten. For subtraction or division, the hand crank was turned 
backward. 



18 Jordan's encyclopedic reference work on geodesy was constantly updated. He credited the material to scarcely anyone by name
except Gauss himself. Althoen and McLaughlin [4] explain it is this Jordan, from Hanover, for whom the Gauss-Jordan algorithm for
inverting matrices is misnamed.

Figure 10: André-Louis Cholesky, 1875-1918, at the start of his student days at the École Polytechnique, 1895-1897. Courtesy of the
Archives of the Bibliothèque Centrale of the École Polytechnique.



Electrified calculators were made from about 1920. In these machines, collectively called rotary calculators,
a motor replaced the hand crank, and more automated designs followed. The multiplier register became an input
that automated the program of revolutions and carriage shifts.^19 Since these fully automatic machines only
became available during the Great Depression, first cost and then wartime rationing limited their use, so they
were a luxury for scientific computers. For example, Grier [89, p. 220] reports that the Mathematical Tables
Project gave most computers only paper and pencils because the price of an electric calculator nearly equalled
their annual salary. [30] explains that three variants survived after World War II: Friden, Monroe, and the
ultra quick and quiet Marchant "silent speed" calculators from Oakland, California.

The most significant feature of all these machines for computers was the ability to accumulate a sum 
of products. It saved time because the individual products need not be recorded in a table before summing 
them, as Doolittle had done in his table B. Moreover, this feature was believed to result in more accurate 
calculations. The multiplicand and multiplier often had the same capacity, and then the accumulator had 
twice the digits of either. The decimal point locations chosen for the registers might require a number in the 
accumulator to be rounded before it could be reentered. Therefore, Dwyer [53, p. 112] explained, summing
products without reentry was "theoretically more accurate" because it gave "the approximation resulting from 
a number of operations rather than the combination of approximations resulting from the operations." 
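
Dwyer's distinction can be illustrated with a small decimal model. The sketch below is an assumption-laden illustration rather than a description of any particular machine: the register precision of four fractional digits, the function names, and the sample numbers are all hypothetical. It compares rounding each product before it is added (as when products are written in a table and reentered) with accumulating the exact double-length products and rounding once.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical register precision: four fractional decimal digits.
PLACES = Decimal("0.0001")


def rounded(x):
    """Round a Decimal to the assumed register precision."""
    return x.quantize(PLACES, rounding=ROUND_HALF_UP)


def sum_with_reentry(pairs):
    """Round every product before adding, as when each product is recorded first."""
    return sum((rounded(a * b) for a, b in pairs), Decimal(0))


def sum_accumulated(pairs):
    """Accumulate the exact double-length products and round once at the end."""
    return rounded(sum((a * b for a, b in pairs), Decimal(0)))


# Illustrative four-digit factors (not taken from any historical calculation).
pairs = [(Decimal("0.1190"), Decimal("0.1612")),
         (Decimal("0.1905"), Decimal("0.3810")),
         (Decimal("0.4619"), Decimal("0.6258"))]

exact = sum((a * b for a, b in pairs), Decimal(0))
print("exact sum of products:", exact)
print("rounded once at end:  ", sum_accumulated(pairs))
print("each product rounded: ", sum_with_reentry(pairs))
```

In this example the two procedures differ in the last retained digit, which is the effect Dwyer described as a combination of approximations.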



10.4. Cholesky: Machine Algorithm 

Mechanization cannot be why Doolittle revised elimination in 1878 because he began using a manual
adding machine only in 1890 [89, p. 93]. The first algorithm intended for a machine may be that of the
military geodesist and World War I casualty André-Louis Cholesky; see Figure 10. A biography with a
discussion of his work has been prepared by [21], see also [20].

19 This usage of program by Chase [30, p. 215] exhibits the original technical meaning: a sequence of events designed to occur in
complex machinery. Grier [88] traces the evolution to the verb to program in computer science.
Like Doolittle, Cholesky calculated Gauss's angle adjustments, case 2 of the least squares problem, which
is stated here in section 6. Although matrices were known when Cholesky developed his algorithm, they were
not used outside pure mathematics, so his invention is better understood without them. Cholesky wrote the
condition equations as [14, eqn. 1],

\[
\begin{aligned}
a_1 x_1 + a_2 x_2 + a_3 x_3 + \cdots + a_n x_n + K_1 &= 0 \\
b_1 x_1 + b_2 x_2 + b_3 x_3 + \cdots + b_n x_n + K_2 &= 0 \\
&\;\;\vdots \\
\ell_1 x_1 + \ell_2 x_2 + \ell_3 x_3 + \cdots + \ell_n x_n + K_p &= 0
\end{aligned}
\tag{10}
\]

where $n > p$, and he wrote the normal equations as [14, eqn. 5],

\[
\begin{aligned}
\alpha^1_1 \lambda_1 + \alpha^2_1 \lambda_2 + \cdots + \alpha^p_1 \lambda_p + K_1 &= 0 \\
\alpha^1_2 \lambda_1 + \alpha^2_2 \lambda_2 + \cdots + \alpha^p_2 \lambda_p + K_2 &= 0 \\
&\;\;\vdots \\
\alpha^1_p \lambda_1 + \alpha^2_p \lambda_2 + \cdots + \alpha^p_p \lambda_p + K_p &= 0
\end{aligned}
\tag{11}
\]

where $\alpha^k_j = \alpha^j_k$ is the sum of products of coefficients in the $j$th and $k$th conditions, for example,
$\alpha^p_2 = b_1 \ell_1 + \cdots + b_n \ell_n$.

Cholesky's remarkable insight was that, since many underdetermined systems share the same normal
equations, for any normal equations there may be some condition equations that can be directly solved more
easily [14, p. 70]. Cholesky found his alternate equations in the convenient, triangular form (introducing new
unknowns, $y$ in place of $x$, and new coefficients, $\beta$ in place of $\alpha$),^20

\[
\begin{aligned}
\beta^1_1 y_1 + K_1 &= 0 \\
\beta^1_2 y_1 + \beta^2_2 y_2 + K_2 &= 0 \\
\beta^1_3 y_1 + \beta^2_3 y_2 + \beta^3_3 y_3 + K_3 &= 0 \\
&\;\;\vdots \\
\beta^1_p y_1 + \beta^2_p y_2 + \beta^3_p y_3 + \cdots + \beta^p_p y_p + K_p &= 0
\end{aligned}
\tag{12}
\]

Cholesky discovered the coefficients in equation (12) are given by straightforward formulas [14, p. 72].

\[
\begin{aligned}
\beta^i_i &= \sqrt{\alpha^i_i - (\beta^1_i)^2 - (\beta^2_i)^2 - \cdots - (\beta^{i-1}_i)^2} \\
\beta^i_{i+r} &= \frac{\alpha^i_{i+r} - \beta^1_i \beta^1_{i+r} - \beta^2_i \beta^2_{i+r} - \cdots - \beta^{i-1}_i \beta^{i-1}_{i+r}}{\beta^i_i}
\end{aligned}
\tag{13}
\]

The solution $\lambda$ of the normal equations expresses the solution of the condition equations as a linear
combination of the (transposed) coefficients in the condition equations. For Cholesky, these combinations become a
system of equations to be solved for $\lambda$ [14, eqn. 7].

\[
\begin{aligned}
y_1 &= \beta^1_1 \lambda_1 + \beta^1_2 \lambda_2 + \cdots + \beta^1_p \lambda_p \\
y_2 &= \beta^2_2 \lambda_2 + \cdots + \beta^2_p \lambda_p \\
&\;\;\vdots \\
y_p &= \beta^p_p \lambda_p
\end{aligned}
\tag{14}
\]

Since the original condition equations (10) and Cholesky's equations (12) have the same normal equations
(11), the quantities $\lambda$ are the same for both problems. Once obtained, $\lambda$ can be used to evaluate $x$ as usual
(from the transpose of the original coefficients), thereby solving the original condition equations. Therefore
Cholesky's method was to form his new coefficients by equation (13), then to solve (12) by
forward-substitution for $y$, next to solve (14) by back-substitution for $\lambda$, and finally to evaluate $x$.

\[
\begin{aligned}
x_1 &= a_1 \lambda_1 + b_1 \lambda_2 + \cdots + \ell_1 \lambda_p \\
x_2 &= a_2 \lambda_1 + b_2 \lambda_2 + \cdots + \ell_2 \lambda_p \\
x_3 &= a_3 \lambda_1 + b_3 \lambda_2 + \cdots + \ell_3 \lambda_p \\
&\;\;\vdots \\
x_n &= a_n \lambda_1 + b_n \lambda_2 + \cdots + \ell_n \lambda_p
\end{aligned}
\tag{15}
\]

20 Benoit [14] unfortunately used $\alpha$ for the new coefficients which here has been replaced by $\beta$ to more easily distinguish them from
the original coefficients $\alpha$.

In contrast, Gauss's algorithm was to solve (11) by elimination for $\lambda$, then to evaluate (15) for $x$.
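
The four steps of Cholesky's procedure, following equations (12) to (15), can be sketched in Python. This is a minimal illustration in floating point, assuming the condition equations are supplied as a list of coefficient rows `C` and constants `K`; the function name `cholesky_adjustment`, the variable names, and the small test problem are hypothetical and do not come from Benoit's paper.

```python
import math


def cholesky_adjustment(C, K):
    """Minimum-norm solution x of the condition equations C x + K = 0.

    C is a list of p rows of n coefficients (p < n); K is a list of p constants.
    """
    p, n = len(C), len(C[0])

    # Normal-equation coefficients: sums of products of pairs of conditions.
    alpha = [[sum(C[j][t] * C[k][t] for t in range(n)) for k in range(p)]
             for j in range(p)]

    # Equation (13): beta[row][col], computed one column at a time.
    beta = [[0.0] * p for _ in range(p)]
    for i in range(p):
        beta[i][i] = math.sqrt(alpha[i][i] - sum(beta[i][k] ** 2 for k in range(i)))
        for row in range(i + 1, p):
            s = sum(beta[i][k] * beta[row][k] for k in range(i))
            beta[row][i] = (alpha[row][i] - s) / beta[i][i]

    # Equation (12): forward-substitution for y.
    y = [0.0] * p
    for i in range(p):
        y[i] = (-K[i] - sum(beta[i][j] * y[j] for j in range(i))) / beta[i][i]

    # Equation (14): back-substitution for lambda.
    lam = [0.0] * p
    for i in reversed(range(p)):
        s = sum(beta[j][i] * lam[j] for j in range(i + 1, p))
        lam[i] = (y[i] - s) / beta[i][i]

    # Equation (15): x from the transposed condition coefficients.
    return [sum(C[j][t] * lam[j] for j in range(p)) for t in range(n)]


# Hypothetical example: two conditions on three unknowns.
C = [[1.0, 1.0, 1.0],
     [1.0, 2.0, 3.0]]
K = [-6.0, -14.0]
x = cholesky_adjustment(C, K)
print(x)
```

Each inner `sum` is a sum of products of the kind a calculating machine could accumulate without recording the individual terms.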

Benoit [14] published his colleague's method posthumously. A similar manuscript dated 1910 was found
among Cholesky's military papers and has recently appeared, Cholesky [32]. French geodesists evidently
continued to use the method because Benoit thought to publish it in his own format with revised notation and
a table showing how the calculation was conducted, see Figure 11. Benoit evaluated the whole $i$th column
of coefficients, $\beta^i_i, \beta^i_{i+1}, \beta^i_{i+2}, \ldots, \beta^i_p$, before going on to the next column. The table is more compact
than Doolittle's tables because much of the intermediate work is not recorded: each $\beta^i_j$ is a sum of products.
Benoit and Cholesky mentioned using calculating machines to accumulate these sums, and Cholesky
specifically referred to the Dactyle brand of the Odhner design. Cholesky [32] reported solving 10 equations with
5-digit numbers in 4 to 5 hours.
Figure 11: How Cholesky may have organized the calculation to solve the normal equations by what is now the Cholesky factorization,
from the fold-out table of Benoit [14]. The left column and top row are labels; the other positions would be occupied by numbers. The
coefficients, $\alpha$, and the constant terms, $K$, of the normal equations are placed for reference above the diagonal. The coefficients, $\beta$, of
Cholesky's manufactured condition equations are written below the diagonal. The solution, $y$, of his condition equations stands in the
bottom row. The solution, $\lambda$, of the normal equations is recorded in the right column. Some additional rows and columns for arithmetic
checks and for accuracy estimates have been omitted.

Cholesky's method remained obscure, compared to Doolittle's work, for 20 years after the publication by
Benoit [14]. Nevertheless, it was used in Nordic Europe during this time.

The normal equations have been solved by the Cholesky-Rubin method, which offers the advan- 
tage that the solution is easily effected on a calculating machine (Cholesky) and that the most 
probable values and their mean errors are derived simultaneously (Rubin). — Ahlmann and 
Rosenbaum [1, p. 30]

Rosenbaum applied Cholesky's method to the case 1 normal equations, so the method was extended to the
other type of least squares problem sometime between 1924 and 1933. That work has not been previously
discussed in the historical literature, so evidently some primary literature remains to be identified.^21 Jensen
[109, p. 22], writing in a Danish geodetic publication, remarked that Cholesky's method "ought to be more
generally used." Soon thereafter it was independently discovered in matrix form by Dwyer and was put to
use in the United States.

21 Thus Brezinski [20] claims incorrectly that after Benoit [14] "a period of 20 years followed without any mentioning of the work."

Figure 12: Prescott Durand Crout, 1907-1984, circa 1936. Courtesy MIT Museum.



10.5. Crout: "each element is determined by one continuous machine operation"



Prescott Crout, see Figure 12, was a professor of mathematics at the Massachusetts Institute of Technology
with an interest in mathematics for electrical engineering. In a paper that is a model of brevity, [42] listed
the gamut of uses for linear equations that had developed after least squares. Crout invented the last algorithm
specifically for hand-operated calculators.

Crout [42, p. 1239] explained "the method was originally obtained by combining the various processes
which comprise Gauss's method, and adapting them for use with a computing machine." He wrote the
coefficients in a rectangular matrix (his usage) with the constant terms in the final column. Crout's method
consisted of a few terse rules for transforming the numbers. So much had changed in the application of
Newton's elimination rule that all semblance of symbolic algebra had disappeared! Crout's instructions are here
restated more expansively.

1. The first column is left unchanged. The first row to the right of the diagonal is divided by the diagonal 
entry. 

2. An entry on or below the diagonal is reduced by the sum of products between entries to the left in its 
row and the corresponding entries above in its column. This calculation is permitted only after all those 
entries themselves have been transformed. 

3. Ditto an entry above the diagonal except it is lastly divided by the diagonal entry in its row. 

4. On completing the above steps, (summarizing further instructions) conduct a back-substitution using the 
coefficients above the diagonal and the final column of transformed constants. 

These steps obviously were intended for calculators that could accumulate sums of products. Like Doolittle
before him, Crout included instructions to improve the accuracy of the solution by an unnamed iterative
refinement process. He concluded the paper with a rigorous proof that the method necessarily solves the
intended equations.
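
The four rules restated above can be sketched in Python. This is an illustrative reading of the rules under stated assumptions, not Crout's own layout: the function name `crout_solve` is hypothetical, the order of work (one column on or below the diagonal, then one row above it) is one ordering that satisfies the requirement that needed entries be transformed first, and the test system is the one shown in Figure 14.

```python
def crout_solve(aug):
    """Transform an n x (n+1) augmented matrix by Crout's rules and solve it.

    Returns the transformed table T and the solution x.
    """
    n = len(aug)
    T = [row[:] for row in aug]
    for k in range(n):
        # Rules 1 and 2: entries on or below the diagonal in column k are reduced
        # by the sum of products of entries to the left in the row with entries
        # above in the column (an empty sum leaves the first column unchanged).
        for i in range(k, n):
            T[i][k] -= sum(T[i][m] * T[m][k] for m in range(k))
        # Rules 1 and 3: entries right of the diagonal in row k are reduced the
        # same way and lastly divided by the diagonal entry of the row.
        for j in range(k + 1, n + 1):
            T[k][j] = (T[k][j] - sum(T[k][m] * T[m][j] for m in range(k))) / T[k][k]
    # Rule 4: back-substitution with the entries above the diagonal and the
    # final column of transformed constants.
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = T[i][n] - sum(T[i][j] * x[j] for j in range(i + 1, n))
    return T, x


# Assumed test system: the equations of Figure 14, constants in the last column.
aug = [[1.0, 0.4, 0.5, 0.6, 0.2],
       [0.4, 1.0, 0.3, 0.4, 0.4],
       [0.5, 0.3, 1.0, 0.2, 0.6],
       [0.6, 0.4, 0.2, 1.0, 0.8]]
T, x = crout_solve(aug)
print(x)
```

Every assignment in the loops is a single sum of products followed by at most one division, which matches the section title's "one continuous machine operation."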

The method was quickly adopted by those with access to calculators. [41] reprised a version of his paper
in a series of manuals for a calculator manufacturer. The Marchant Calculating Machine Company [132]
prepared even more detailed instructions in a subsequent manual. Wilson [184, p. 352] particularly
recommended these manuals to academic researchers. Black [17] cited both Crout and Dwyer, who is discussed
below, in describing how to use calculators for engineering work after World War II.



11. Enter Matrices 

Von Neumann [140] cited matrices as an example of the delays between the invention of a mathematical
idea and its use outside pure mathematics. Hawkins [93, 94, 95] found that matrix algebra was independently
invented by Eisenstein (1852), Cayley (1858), Laguerre (1867), Frobenius (1878) and Sylvester (1881) to
clarify subjects such as determinants, quadratic forms, and the elementary divisors that Weierstrass created to
study ordinary differential equations. The first application of matrices outside mathematics was Heisenberg's
matrix (quantum) mechanics in 1925. Through the first half of the 20th century, most accounts of simultaneous
linear equations lacked matrix notation, for example see Aitken [2, 3], Crout [41, 42], Dwyer [54], Hotelling
[101], Bargmann et al. ED, and MacDuffee [130].

The relationship between Gaussian elimination and matrix algebra was not obviously useful. The few
descriptions of computations in terms of matrix algebra in the first half of the 20th century produced no
significantly new computational methods. Instead, matrix algebra led to a consolidation of approaches by
revealing that all the several elimination algorithms were trivially related through what is now called triangular
factorization. This paradigm, in the sense of Kuhn [119], more so than the calculation of equation Q, is what
"Gaussian elimination" means to computational mathematicians today.



11.1. Toeplitz: First Triangular Factors 

David Hilbert's study of integral equations inspired his students Erhard Schmidt and Otto Toeplitz to
promote infinite matrices as a representation for linear operators on function spaces.^22 As part of that research,
Toeplitz [172] used determinants to examine the invertibility of infinite matrices. "If one uses the symbolism
of matrix calculus (see for example Frobenius)," then any finite symmetric matrix $S$ with all leading principal
determinants not zero, has a matrix $U$ with

\[
U'U = S^{-1} \quad \text{equivalently} \quad U^{-1} U'^{-1} = S
\tag{16}
\]

(original notation) where $'$ is transposition. In this way Toeplitz first exhibited what now is called the Cholesky
factor, $U^{-1}$, though he left it in the computationally useless form of determinants. It was clear from the
determinantal formulas that $U$ was a lower triangular matrix. Toeplitz remarked that Lagrange and Gauss had
such a decomposition for quadratic forms, and he cited Jacobi [107] for a similar decomposition of bilinear
forms. His equation (16) appears to be the first expression of any such formula in matrix notation. Moreover,
the equation may be unique to Toeplitz because his $U$ and $S$ are related by inversion. Taussky and Todd [169]
suggested basing a proof on the formulas of Gantmacher [69, v. 1, p. 39].



11.2. Banachiewicz: Cracovian Algorithms

One path to matrix algebra is remembered in neither mathematical history nor heritage. In addition to the 
motivations in pure mathematics for the algebra of Cayley-Eisenstein-Frobenius-Laguerre-Sylvester, there is 
a motivation in computing for the Cracovian algebra of Tadeusz Banachiewicz. An astronomer and geodesist 
like Gauss, Banachiewicz [6] is still known for determining the orbit of Pluto. Witkowski [185] wrote the
first of several biographical sketches of this concentration camp survivor. 

Cracovians began as a notational scheme for astronomical calculations and became an arithmetic different 
from Cayley's for matrices of arbitrary dimension. The distinguishing feature of the Cracovian algebra is a 
column-by-column product which is more natural when calculating with columns of figures by hand. 



22 Bernkopf [15, p. 330] relates that the infinite matrix approach was shown to be fundamentally inadequate by another Hilbert protege,
John von Neumann.

It must, however, be conceded that in practice it is easier to multiply column by column than
to multiply row by column. ... It may, in fact, be said, that the computations are made by
cracovians and the theory by matrices — Jensen [109, p. 5]

Banachiewicz posed least squares problems in terms of Cracovians for the purpose of improving
computations. See Kocinski [117] for descriptions of the algorithms and additional references to the original papers.
Jensen [108, pp. 3, 19] reports hearing Banachiewicz advocate this approach at meetings of the Baltic Geodetic
Commission as early as 1933. Banachiewicz JS] |7) independently discovered Cholesky's method, and
although later than Cholesky, he had greater impact. Banachiewicz inspired the work of Jensen, and he was
widely cited: by Jensen [109, p. 45], Dwyer [55, p. 89], Cassinis [29, p. 78], Bodewig (H part V, p. 90],
Laderman [124], and again by Dwyer [57, p. 103]. Reflecting the thesis of this paper, the influential
mathematician Householder [103, 104, p. 142] would neglect the work of Banachiewicz, who traded places with
Cholesky in mathematical obscurity.

11.3. Frazer, Duncan, and Collar: Elementary Matrices 

The work of R. A. Frazer, W. J. Duncan, and A. R. Collar exemplifies the source for matrix algebra that
Hawkins [95] finds in Weierstrass's study of dynamical systems. In this case the immediate source was Baker
[5] with whom Frazer studied. See Pugsley [152] for a biography of Frazer. Felippa [63] identifies Frazer,
Duncan, and Collar as the developers of the finite element method for structural analysis. Their study of
airframe vibration, flutter, thus translated into questions about matrix eigenvalues. They wrote an influential
book that explained computations for dynamics in terms of matrices. Frazer's aerodynamics section at the
National Physical Laboratory also indirectly influenced mathematical research by employing Leslie Fox and
James Wilkinson [187], and Olga Taussky [170], all of whom became prominent computational
mathematicians after World War II.

Frazer et al. [68, pp. 96-99] viewed elimination as "building up the reciprocal matrix in stages by
elementary operations" which could produce a triangular matrix "such that its reciprocal can be found easily."
They demonstrated column elimination of a 4 x 4 matrix $a$ (their notation), so they had $a M_1 M_2 M_3 = \tau$ where
the $M_i$ are "post multipliers" and $\tau$ is "the final triangular matrix." Frazer et al. remarked that $M_1 M_2 M_3$ was
itself a triangular matrix but "opposite-handed" from $\tau$. Jensen [109, pp. 13-15] borrowed this approach
to establish the connection between Gaussian elimination and triangular factoring. He restated it in a more
conventional form of row operators acting on the left but using the same notation $M_i$. Modern textbooks still
use this exposition to establish the relationship between Gaussian elimination and matrix factoring, although
they attribute it to neither Frazer et al. nor to Jensen. For example see the text of Golub and Van Loan [83,
p. 93] using, remarkably, the same notation $M_i$ but now called a "Gauss transformation." Frazer et al. did not
continue their presentation to the modern conclusion: they neither commented on matrix factoring nor did
they write out a factorization such as $a = \tau (M_1 M_2 M_3)^{-1}$.
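
The step from elementary operators to a triangular factorization can be made concrete with a short sketch. The Python below uses the conventional row-operator form (operators acting on the left) rather than Frazer et al.'s post multipliers; the function names are hypothetical, no pivoting is attempted, and the 4 x 4 test matrix is simply the coefficient matrix of Figure 14.

```python
def identity(n):
    """Return the n x n identity matrix as a list of rows."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(X, Y):
    """Plain row-by-column matrix product."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def gauss_transformations(a):
    """Return the row operators M_1, ..., M_{n-1} and U = M_{n-1} ... M_1 a."""
    n = len(a)
    U = [row[:] for row in a]
    Ms = []
    for k in range(n - 1):
        M = identity(n)
        for i in range(k + 1, n):
            M[i][k] = -U[i][k] / U[k][k]   # multiplier that clears entry (i, k)
        U = matmul(M, U)
        Ms.append(M)
    return Ms, U


def unit_lower_factor(Ms):
    """L = M_1^{-1} ... M_{n-1}^{-1}, obtained by negating the stored multipliers."""
    n = len(Ms[0])
    L = identity(n)
    for k, M in enumerate(Ms):
        for i in range(k + 1, n):
            L[i][k] = -M[i][k]
    return L


a = [[1.0, 0.4, 0.5, 0.6],
     [0.4, 1.0, 0.3, 0.4],
     [0.5, 0.3, 1.0, 0.2],
     [0.6, 0.4, 0.2, 1.0]]
Ms, U = gauss_transformations(a)
L = unit_lower_factor(Ms)
LU = matmul(L, U)   # reproduces a up to rounding
```

Here `U` is upper triangular and `L` is the "opposite-handed" unit lower triangular matrix, so that `a = L U` is the factorization that was left unwritten.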



11.4. Dwyer: Abbreviated Doolittle Method and Square Root Method 

Paul Sumner Dwyer was a professor at the University of Michigan and a president of the Institute of
Mathematical Statistics. See Figure 13. His longtime colleague Cecil Craig [39] wrote a short biography.
Dwyer collaborated with an official at the United States Department of Agriculture named Frederick Waugh.
The USDA practiced a type of economics, known as econometrics [64], whose research methodology consists
of data analysis by, essentially, the method of least squares. Thus, the partnership with Waugh exposed Dwyer
to computers with government resources, so he could expect calculations to be mechanized, which placed him
at the forefront of computing practice.

Dwyer [53] began from a comparison of solution methods that emphasized sources in American and
English statistics. He considered 22 papers from 1927 to 1939 and suggested their bibliographies should
be consulted for an even more thorough picture of the subject. Dwyer's review painstakingly uncovered the
similarities between proliferating methods distinguished by minor changes to the placement of numbers in
tables. For example, he noted a method of Deming [45] was equivalent, except for superficial changes, to the
"method of pivotal condensation" of Aitken [2], both of which Dwyer included under the rubric "method of
single division," see Figure 14. The differences among these methods only seem pedestrian until one must
choose the most effective way to calculate and record all the numbers in the tables by hand. Nevertheless,
in another paper, Waugh and Dwyer [177] summarized the field more succinctly, observing the methods are
more or less the same, except "Crout divides the elements of each row by the leading element while we divide
the elements of columns." Dwyer credited to others, including Waugh, the observation that accumulating
calculators made it unnecessary to record the series terms in Doolittle's table B. Dwyer called the streamlined
procedure the "abbreviated method of single division - symmetric" or the "abbreviated Doolittle method."

Figure 13: Paul Sumner Dwyer, 1901-19??, circa 1960s, courtesy Prof. Ingram Olkin of Stanford University.





                         x1       x2       x3       x4   r.h.s.
                  1   1.0000    .4000    .5000    .6000    .2000   original equations
                  2    .4000   1.0000    .3000    .4000    .4000
                  3    .5000    .3000   1.0000    .2000    .6000
                  4    .6000    .4000    .2000    1.000    .8000
1 =>              5   1.0000    .4000    .5000    .6000    .2000   elimination
5, 2 =>           6             .8400    .1000    .1600    .3200
5, 3 =>           7             .1000    .7500   -.1000    .5000
5, 4 =>           8             .1600   -.1000    .6400    .6800
6 =>              9            1.0000    .1190    .1905    .3810
9, 7 =>          10                      .7381   -.1190    .4619
9, 8 =>          11                     -.1190    .6095    .6190
10 =>            12                     1.0000   -.1612    .6258
12, 11 =>        13                               .5903    .6935
13 =>            14                              1.0000   1.1748   back-substitution
12, 14 =>        15                     1.0000             .8152
9, 14, 15 =>     16            1.0000                      .0602
5, 14, 15, 16 => 17   1.0000                              -.9366

Figure 14: A form of Gaussian elimination that Paul Dwyer [53] called "the method of single division" is equivalent but for cosmetic
changes to the "method of pivotal condensation" of Aitken [2], and to an earlier method of Deming [45]. The nonunitary numbers in
rows 5, 9, and 12 are the upper diagonal entries in Crout's table.
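
The arithmetic of Figure 14 can be retraced with a short program. The following is only a minimal sketch of the row operations shown in the figure, not Dwyer's tabular layout; the function name `single_division` is a hypothetical label, and the sketch assumes no pivot is zero, so it does no row interchanges.

```python
def single_division(a, b):
    """Solve a x = b by elimination, dividing each pivot row by its pivot once.

    Returns (unit_rows, x): the normalized pivot rows (coefficients followed
    by the right-hand side, like rows 5, 9, 12, 14 of Figure 14) and x.
    Assumes every pivot is nonzero (no row interchanges are attempted).
    """
    n = len(b)
    # Work on an augmented copy so the caller's data is untouched.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    unit_rows = []
    for i in range(n):
        pivot = m[i][i]
        # The "single division": normalize the pivot row.
        m[i] = [v / pivot for v in m[i]]
        unit_rows.append(m[i][:])
        # Subtract multiples of the normalized row from the rows below it.
        for j in range(i + 1, n):
            f = m[j][i]
            m[j] = [vj - f * vi for vj, vi in zip(m[j], m[i])]
    # Back-substitution with the unit-pivot rows.
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))
    return unit_rows, x


# The system of Figure 14.
A = [[1.0, 0.4, 0.5, 0.6],
     [0.4, 1.0, 0.3, 0.4],
     [0.5, 0.3, 1.0, 0.2],
     [0.6, 0.4, 0.2, 1.0]]
b = [0.2, 0.4, 0.6, 0.8]
rows, x = single_division(A, b)
```

Because the program carries full floating-point precision while the figure rounds every entry to four decimals, the computed values agree with the tabulated ones only to about three decimal places.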
Dwyer [55] independently interpreted Gaussian elimination as matrix factoring. His primary interest was
the case 1 least squares problem, so beginning from the coefficient matrix A of the normal equations, he
showed the abbreviated Doolittle method was an "efficient way of building up" some "so called triangular"
matrices S and T with A - S'T = 0. Dwyer remarked that this formula was the key to "a more general
theory." For the normal equations it was possible to choose S = T for a "square root" method (modern
Cholesky method) which Dwyer [56] developed in a later paper. He added that he found no other matrix
algebra interpretation of solving equations except Banachiewicz [9] who, Dwyer reported, also had a square
root method.
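
The choice S = T can be illustrated numerically. What follows is a modern textbook formulation of the square root (Cholesky) factoring, offered as a sketch and not as Dwyer's own tabular procedure; the names `square_root_factor` and `transpose_times` are hypothetical, and the matrix is assumed symmetric and positive definite.

```python
import math


def square_root_factor(a):
    """Return upper-triangular t with a = t' t.

    Assumes a is symmetric positive definite, so every square root is real.
    """
    n = len(a)
    t = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Diagonal entry: what is left of a[i][i] after the earlier rows.
        s = a[i][i] - sum(t[k][i] ** 2 for k in range(i))
        t[i][i] = math.sqrt(s)
        # Remaining entries of row i, each divided by the new diagonal entry.
        for j in range(i + 1, n):
            s = a[i][j] - sum(t[k][i] * t[k][j] for k in range(i))
            t[i][j] = s / t[i][i]
    return t


def transpose_times(s, t):
    """Form the product s' t for square matrices stored as lists of lists."""
    n = len(s)
    return [[sum(s[k][i] * t[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Coefficient matrix of Figure 14, used here only as a convenient example.
A = [[1.0, 0.4, 0.5, 0.6],
     [0.4, 1.0, 0.3, 0.4],
     [0.5, 0.3, 1.0, 0.2],
     [0.6, 0.4, 0.2, 1.0]]
T = square_root_factor(A)
P = transpose_times(T, T)
residual = max(abs(P[i][j] - A[i][j]) for i in range(4) for j in range(4))
```

The residual measures how closely A - T'T = 0 holds in floating-point arithmetic.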

Dwyer especially influenced computers in the United States. [124] reported that [51] popularized Dwyer's
square root method, and that it was even used at the Mathematical Tables Project. European mathematicians
such as Fox [65], however, preferred to apply Cholesky's name. Since Dwyer's many papers and his book on
linear equations (1951) always invoked Doolittle's name, his own name was never attached to either of the
computing methods that he championed.



11.5. Jensen (and Bodewig): Literature Surveys

Two survey papers helped establish the matrix interpretation of Gaussian elimination by describing many
algorithms independent of origin in a common notation. The geodesist Henry Jensen [108, pp. 3, 19] characterized
his work as extending to "matrix symbolism" the results of Banachiewicz for least squares problems
and for the normal equations. Jensen [109, p. 11] found three algorithms for solving the normal equations
were similar in that they could be interpreted as "reducing the matrix in question to a triangular matrix:"
the "Gauss'ian algorithm," the Cracovian method, and Cholesky's method (see footnote 23). To emphasize the similarities
among the triangular factoring algorithms, Jensen used pictograms for triangular matrices with zeroes either
under (◥) or over (◣) the main diagonal (original terminology and pictures). His primary interest was the normal
equations with coefficient matrix A*A = N for a rectangular matrix A, where * is transposition. Jensen
[109, p. 15, eqn. 15; p. 22, eqn. 3] explained that the "Gauss'ian algorithm" amounted to N = […] and
Cholesky's method was N = B*B where B = ◥.

What Jensen [109, pp. 13-16] called the "Gauss'ian algorithm" was his original synthesis of three different
approaches. He began with his own row-oriented version of Frazer et al.'s transformation of a matrix
to triangular form, which was explained here in section 11.3. Jensen conducted the transformation symbolically
by using Gauss's brackets as the matrix entries, thereby connecting the transformations to the work of
Gauss. Although Jensen did not mention Doolittle, he recommended the numbers of calculations should "be
conveniently tabulated" not in matrices but rather in Doolittle's table B.

Jensen's survey appears to have been the conduit through which matrix interpretations were communicated
to those who popularized them in computation. The use in modern textbooks of his formulation of
Frazer et al.'s transformations has been noted. Similarly, Jensen [109, p. 22] wrote that Cholesky's method
"ought to be more generally used than is the case. It is due to Cholesky [14] and was later indicated by
Banachiewicz." Brezinski and Wuytak [23, p. 18] relate how researchers at the National Physical Laboratory
(notably Fox et al. [67] and [174]) learned of Cholesky's method indirectly from Jensen's paper through
John Todd.

The mathematician Ewald Bodewig took Jensen's approach to summarize in matrix notation all methods
for solving linear equations that were known through 1947. His interesting bibliography lay at the end of
his five-part paper but, like Jensen's, it was not comprehensive because he neglected authors such as Crout,
Doolittle, and Dwyer. Bodewig [18, part I, pp. 444-450] took Jensen too literally because he called triangular
matrices left or right to indicate the location of the zeroes instead of the nonzeroes: his "right" was ◣ = D_r
and "left" was ◥ = D_l where D stood for Dreiecksmatrix (triangular matrix). He repeated Jensen's row version of Frazer et al.'s
presentation of Gaussian elimination, and he emphasized their summary formula using Jensen's pictograms,
◣𝔅 = ◥. Bodewig followed Jensen [109] in describing Cholesky's method as 𝔅 = […] for a symmetric
𝔅, where ' is transpose. He remarked Cholesky essentially (?!) showed 𝔄 = D_r D_l for any matrix 𝔄, and that
(in light of ◣𝔄 = ◥) the method in this form was effectively Gaussian elimination.



23 The other, non-triangular methods that Jensen discussed were: solution by determinants, the method of equal coefficients (Gauss- 
Jordan elimination, which he correctly attributes to B. I. Clasen in 1888), Boltz's method, and Kruger's method. 
11.6. Von Neumann and Goldstine: The Combination of Two Tricks

John von Neumann (1947) and his collaborator Herman Goldstine were alone, among the first authors 
to describe Gaussian elimination in terms of matrix algebra, to make a nontrivial use of the relationship. 
They established bounds on the errors of matrix inverses, calculated by a computer using a method related 
to Gaussian elimination, in terms of the ratio of the largest to the smallest singular values of the coefficient 
matrix. That result is beyond the scope of the present discussion because it marks the beginning of modern 
research interests in computational mathematics. Von Neumann and Goldstine's ratio is now called the matrix 
condition number. 

They also were the only authors to show exactly how the "traditional schoolbook method" of equation
(1) calculates triangular factors of the coefficient matrix. Specifically, von Neumann and Goldstine [141,
p. 1051] described the elimination algorithm as creating a sequence of ever-smaller reduced matrices and
vectors from Ax = y (original notation),

\[ A = A^{(1)}, A^{(2)}, A^{(3)}, \ldots, A^{(n)}, \qquad y = y^{(1)}, y^{(2)}, y^{(3)}, \ldots, y^{(n)}, \]

where the rows and columns of $A^{(i)}$ and $y^{(i)}$ are numbered from i to n. For i = 1, . . . , n - 1 the computation is,

\[ A^{(i+1)}_{j,k} = A^{(i)}_{j,k} - A^{(i)}_{j,i} A^{(i)}_{i,k} / A^{(i)}_{i,i} \quad \text{for } j, k > i, \tag{17} \]

\[ y^{(i+1)}_{j} = y^{(i)}_{j} - \bigl( A^{(i)}_{j,i} / A^{(i)}_{i,i} \bigr) \, y^{(i)}_{i} \quad \text{for } j > i. \tag{18} \]

Next, the algorithm solves by substitution the equations B'x = z where the entries of B' and z are chosen
from the reduced matrices and vectors (the first row of each). For a matrix C of the multipliers
$A^{(i)}_{j,i} / A^{(i)}_{i,i}$ with j > i (note the unit diagonal), von Neumann and Goldstine summed
equation (18) over i and rearranged to give Cz = y. From this equation and B'x = z they concluded CB' = A.
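
The conclusion CB' = A can be checked numerically. The sketch below applies equations (17) and (18) with zero-based indices and collects C, B', and z as the text describes; the function names `reduce_system` and `matmul` are hypothetical, no row interchanges are attempted, and the example system is the one from Figure 14.

```python
def reduce_system(a, y):
    """Run equations (17)-(18) and return C, B', z as lists of lists."""
    n = len(y)
    a = [row[:] for row in a]      # reduced matrices, overwritten in place
    y = list(y)                    # reduced right-hand sides
    c = [[float(j == i) for i in range(n)] for j in range(n)]  # unit diagonal
    b = [[0.0] * n for _ in range(n)]
    z = [0.0] * n
    for i in range(n):
        # The first row of the current reduced matrix and vector
        # supplies row i of B' and entry i of z.
        b[i][i:] = a[i][i:]
        z[i] = y[i]
        for j in range(i + 1, n):
            c[j][i] = a[j][i] / a[i][i]          # multiplier stored in C
            for k in range(i + 1, n):
                a[j][k] -= c[j][i] * a[i][k]     # equation (17)
            y[j] -= c[j][i] * y[i]               # equation (18)
    return c, b, z


def matmul(p, q):
    """Multiply two matrices stored as lists of lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]


A = [[1.0, 0.4, 0.5, 0.6],
     [0.4, 1.0, 0.3, 0.4],
     [0.5, 0.3, 1.0, 0.2],
     [0.6, 0.4, 0.2, 1.0]]
y = [0.2, 0.4, 0.6, 0.8]
C, B, z = reduce_system(A, y)
CB = matmul(C, B)
Cz = [sum(C[i][k] * z[k] for k in range(4)) for i in range(4)]
err_A = max(abs(CB[i][j] - A[i][j]) for i in range(4) for j in range(4))
err_y = max(abs(Cz[i] - y[i]) for i in range(4))
```

Both errors are at the level of rounding, so the products CB' and Cz reproduce A and y. The diagonal of B' holds the pivots that appear, rounded, in rows 5, 6, 10, and 13 of Figure 14.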

We may therefore interpret the elimination method as . . . the combination of two tricks: First, 
it decomposes A into a product of two semi-diagonal matrices . . . [and second] it forms their 
inverses by a simple, explicit, inductive process. 

-- von Neumann and Goldstine [141, p. 1053]

Note von Neumann and Goldstine wrote "semi-diagonal" for "triangular." 

Von Neumann and Goldstine found a lack of symmetry in the elimination algorithm because the first
factor always had 1's on its main diagonal. They divided the second factor by its diagonal to obtain B' = DB,
hence A = CDB which they said was "a new variant" of Gaussian elimination (op. cit., p. 1031) that is now
written A = LDU,

\[ L_{j,i} = A^{(i)}_{j,i} / A^{(i)}_{i,i}, \qquad D_{i,i} = A^{(i)}_{i,i}, \qquad U_{i,k} = A^{(i)}_{i,k} / A^{(i)}_{i,i}, \qquad j, k > i, \tag{19} \]

where $A^{(i)}_{j,k}$ are the entries of the reduced matrices given by equation (17).
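
As an illustration of equation (19), the following sketch reads L, D, and U off the reduced matrices. The function name `ldu` and the 3-by-3 example matrix are hypothetical choices made so that every intermediate value is exact; the sketch assumes nonzero pivots and does no row interchanges.

```python
def ldu(a):
    """Return (L, D, U) with a = L diag(D) U, following equation (19).

    L and U have unit diagonals; D is returned as a list of pivots.
    """
    n = len(a)
    a = [row[:] for row in a]  # reduced matrices, overwritten in place
    L = [[float(i == j) for j in range(n)] for i in range(n)]
    U = [[float(i == j) for j in range(n)] for i in range(n)]
    D = [0.0] * n
    for i in range(n):
        D[i] = a[i][i]                           # pivot of the reduced matrix
        for k in range(i + 1, n):
            U[i][k] = a[i][k] / D[i]             # pivot row divided by the pivot
        for j in range(i + 1, n):
            L[j][i] = a[j][i] / D[i]             # multiplier
            for k in range(i + 1, n):
                a[j][k] -= L[j][i] * a[i][k]     # equation (17)
    return L, D, U


A = [[2.0, 1.0, 1.0],
     [4.0, 3.0, 3.0],
     [8.0, 7.0, 9.0]]
L, D, U = ldu(A)
# Rebuild A from the three factors to confirm the factorization.
rebuilt = [[sum(L[i][p] * D[p] * U[p][j] for p in range(3)) for j in range(3)]
           for i in range(3)]
```

Both triangular factors carry 1's on their diagonals here, which is the symmetry between the factors that the passage describes; for this nonsymmetric example L and U are nevertheless not transposes of each other.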



What accounts in part for the multiple inventors of elimination algorithms is that the triangular matrix
decompositions are not unique. The choice of unnormalized factors is limited only by the requirement

\[ L_{i,i} U_{i,i} = A^{(i)}_{i,i}. \tag{20} \]

The factorizations A = L(DU), (D^{1/2}U)'(D^{1/2}U), and (LD)U that differently apportion the diagonal are
known today by the names Doolittle, Cholesky, and Crout, respectively.
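
A small worked case may make the three apportionments concrete. The symmetric 2-by-2 matrix below is a hypothetical example chosen for easy arithmetic, with L, D, and U as in A = LDU; because the matrix is symmetric, U = L'.

```latex
A = \begin{pmatrix} 4 & 2 \\ 2 & 3 \end{pmatrix}, \qquad
L = \begin{pmatrix} 1 & 0 \\ \tfrac12 & 1 \end{pmatrix}, \quad
D = \begin{pmatrix} 4 & 0 \\ 0 & 2 \end{pmatrix}, \quad
U = \begin{pmatrix} 1 & \tfrac12 \\ 0 & 1 \end{pmatrix}.

% Doolittle: the diagonal goes to the right factor.
A = L\,(DU) =
\begin{pmatrix} 1 & 0 \\ \tfrac12 & 1 \end{pmatrix}
\begin{pmatrix} 4 & 2 \\ 0 & 2 \end{pmatrix}

% Cholesky: the diagonal is split evenly between the factors.
A = (D^{1/2}U)'\,(D^{1/2}U) =
\begin{pmatrix} 2 & 0 \\ 1 & \sqrt2 \end{pmatrix}
\begin{pmatrix} 2 & 1 \\ 0 & \sqrt2 \end{pmatrix}

% Crout: the diagonal goes to the left factor.
A = (LD)\,U =
\begin{pmatrix} 4 & 0 \\ 2 & 2 \end{pmatrix}
\begin{pmatrix} 1 & \tfrac12 \\ 0 & 1 \end{pmatrix}
```

In each product the corresponding diagonal entries of the two factors multiply to the pivots 4 and 2.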



12. Gaussian Attribution in Summary 

In summary of the primary and secondary sources, an algorithm functionally equivalent to Gaussian 
elimination appeared in mathematical texts from ancient China. That algorithm likely did not influence the 
invention of symbolic algebra in Europe, so presumably Gaussian elimination developed there independently. 
A few, exemplary systems of linear equations were solved in textbooks written through 1660, sometimes 
exhibiting and sometimes without the rote elimination of variables that distinguishes Gaussian elimination, 
and without discussing a general approach. Newton [142], writing circa 1670, described the successive
elimination of variables as a rule for solving any simultaneous equations, and he noted this explanation was
lacking in contemporary textbooks. The 18th century saw Newton's elimination rule repeated in algorithmic
form by Hammond [91], who emphasized its application to linear equations. Euler [58] specifically addressed
linear equations, and although he noted the possibility of following a fixed procedure to solve them, he
recommended against it. At the very beginning of the next century, the influential textbook author Lacroix
[121] described the algorithm in a manner very similar to Hammond, and further, recommended it as a general
method for solving any simultaneous linear equations.

Gauss cannot be the source for Gaussian elimination in western mathematics because he was born after 
Newton's and Hammond's publications, and he did not write on the subject until after Lacroix's publication.
Moreover the canonical version, equation (1), did not appear in papers where Gauss himself discussed
elimination. Thus the purported Gaussian history, or heritage, and the pedagogically compelling Gaussian 
appellation are simply wrong. If Gauss did not arrive at the elimination process on his own (Euler remarked 
it is the most natural way of proceeding), then as Farebrother [61] alludes, Gauss learnt Gaussian elimination 
from readily available textbooks. 

Nevertheless, the use of Gauss's special notation to solve the least squares normal equations, and his 
stature among astronomers and geodesists, indelibly linked his name to calculations. Whereas in the first half 
of the 19 th century algebra textbooks referred only to elimination, from the second half of the 19 th century 
reference books for astronomy and geodesy always cited Gauss to recommend their least squares calculations. 
His notation at first was regarded as particularly for the normal equations. For example, Chauvenet (1868, p.
530) had the "elimination of unknown quantities from the normal equations . . . according to Gauss," and
Liagre [128, p. 557] "l'élimination des inconnues entre les équations du minimum (équations normales)" by
"les coefficients auxiliaires de Gauss." Astronomy and geodesy at one time accounted for the bulk of the
linear equations solved professionally, so their terminology may have been considered authoritative. Such
citations were shortened eventually to an unspecific "Gauss's procedure." Mathematicians misinterpreted
this usage as attributing ordinary, common elimination to Gauss, but only long after World War II, see Table
1. Von Neumann (1947) apparently was the last prominent mathematician simply to write elimination as
Lacroix [121] and Gauss [71] had, whereas today we mistakenly write Gaussian elimination.
Table 1: Nomenclature for solving simultaneous linear equations by elimination, indicating the gradual evolution to "Gaussian
elimination" over many years.

YEAR  AUTHOR                         USES GAUSS'S BRACKETS  NOMENCLATURE
1728  Newton                                                Transformation of two or more Equations into one . . . in order to exterminate the unknown Quantities
1752  Hammond                                               The Method of resolving Questions, which contain four Equations, . . .
1771  Euler                                                 Der natürlichste Weg bestehet nun darinn, . . .
1804  Lacroix                                               élimination
1805  Legendre                                              par les méthodes ordinaires
1809  Gauss                                                 per eliminationem vulgarem (common elimination)
1818  Lacroix                                               élimination
1822  Euler                                                 the most natural method of proceeding
1831  Ross                                                  continue this series of operations until a single equation . . .
1835  Davies                                                quantities may be eliminated by the following rule
1844  Clark                                                 of elimination when there are three or more . . .
1868  Chauvenet                      ✓                      method of substitution according to Gauss
1879  Liagre                                                pour procéder à l'élimination . . . coefficients auxiliaires de Gauss
1884  Merriman                                              the method of substitution due to Gauss
1888  Doolittle (not the Doolittle)  ✓                      elimination [by] the method of substitution, using a form of notation proposed by Gauss
1895  Jordan                                                Gauss'sche Elimination
1896  Bartlett                       ✓                      method of substitution proposed by Gauss
1905  Johnson                                               method of substitution as developed by Gauss
1906  Wright and Hayford             ✓                      method of substitution introduced by Gauss and the Doolittle method
1907  Helmert                        ✓                      Algorithmus von C. F. Gauss
1912  Palmer                         ✓                      Gauss's method for the solution of normal equations
1924  Benoit                                                les méthodes ordinaires y compris celle de Gauss
1927  Tolley and Ezekiel                                    the direct method of solution developed by Gauss
1941  Crout                                                 Gauss's method
1941  Dwyer                          mentions               notation . . . suggested by Gauss
1943  Hotelling                                             Doolittle method
1944  Jensen                         ✓                      Gauss'ian algorithm
1947  Bodewig                                               Gausssche Verfahren
1947  von Neumann and Goldstine                             elimination
1948  Fox, Huskey, and Wilkinson                            Gaussian algorithm
1948  Turing                                                Gauss elimination process
1953  Householder                                           methods of elimination
1960  Wilkinson                                             Gaussian elimination
1964  Householder                                           Gaussian elimination
1977  Goldstine                                             Gaussian elimination